Method for overexpression of zwitterionic polysaccharides

ABSTRACT

The present invention is directed to methods for producing and selecting novel mutant strains of B. fragilis that constitutively express a particular capsular polysaccharide or only selected capsular polysaccharides; compositions directed to the novel mutant strains of B. fragilis that constitutively express a particular capsular polysaccharide or only selected capsular polysaccharides; improved methods for purification of individual capsular polysaccharides; and compositions directed to novel res02 and inv19 genes and their gene products. Significantly, the present invention provides methods and compositions for overexpressing and purifying immunomodulatory capsular polysaccharide A (PSA) in high yield.

RELATED APPLICATION

This application claims benefit under 35 U.S.C. § 119(e) of U.S. provisional application Ser. No. 60/364,168, filed Mar. 13, 2002, the entire contents of which are hereby incorporated by reference.

GOVERNMENT RIGHTS

This invention was funded in part under National Institutes of Health Grant No. AI44193. The government may retain certain rights in the invention.

FIELD OF THE INVENTION

The invention relates to compositions and methods for the production and isolation of capsular polysaccharides which have been reported to have immunomodulatory effects. Specifically, the invention relates to compositions and methods for the production and isolation of capsular polysaccharide A (PSA) of Bacteroides fragilis (B. fragilis).

BACKGROUND OF THE INVENTION

Capsular polysaccharide A (PSA) of Bacteroides fragilis NCTC9343 has been reported to be an immunomodulator with therapeutic and preventative applications. U.S. Pat. Nos. 5,679,654 and 5,700,787; Tzianabos A O et al. (2000) J Biol Chem 275:6733-40. It was recently reported that in addition to PSA, B. fragilis NCTC9343 synthesizes at least seven other capsular polysaccharides (PSB-PSH). Krinos C M et al. (2001) Nature 414:555-558. It has also recently been reported that expression of seven of these eight capsular polysaccharides of B. fragilis is variable due to phase variation dictated by inversion of DNA segments containing the promoters of each of the polysaccharide biosynthesis loci. Krinos C M et al., supra. The fact that this strain synthesizes so many polysaccharides makes purification of PSA from this strain very laborious. This, coupled with the variable expression of PSA due to its phase variation, often results in a very low yield of PSA following extensive procedures for its purification. Scaled-up purification of PSA for preventative or therapeutic applications thus presents technical obstacles.

SUMMARY OF THE INVENTION

The invention arises in part from the discovery by the present inventors of methods for controlling the phase variation of seven of the eight known capsular polysaccharides of B. fragilis, PSA, PSB, PSD, PSE, PSF, PSG, and PSH. It was surprisingly discovered that inactivation of a novel gene, res02 (also denoted mpi, multiple promoter invertase), has the effect of locking the invertible promoters of each of the capsular polysaccharide biosynthesis loci in either an “on” orientation or an “off” orientation. When an invertible promoter of a capsular polysaccharide is in an on orientation, the promoter is transcriptionally active with respect to the polysaccharide biosynthesis locus associated with the biosynthesis of the capsular polysaccharide. Conversely, when an invertible promoter of a capsular polysaccharide is in an off orientation, the promoter is transcriptionally inactive with respect to the polysaccharide biosynthesis locus associated with the biosynthesis of the capsular polysaccharide. By locking the promoters and then selecting for bacterial cells constitutively expressing only a particular capsular polysaccharide, or expressing only a restricted set of capsular polysaccharides including the particular capsular polysaccharide, it was discovered according to the instant invention that it was now possible to increase the yield of the particular capsular polysaccharide. Thus, it was discovered according to the instant invention that it is now possible to generate and select mutant strains of B. fragilis that constitutively express any combination of selected capsular polysaccharides and not others. Furthermore, using this method it is now possible to generate and select mutant strains of B. fragilis that constitutively express only one particular capsular polysaccharide, e.g., PSA, and no other capsular polysaccharide. Because each selected mutant strain is rendered incapable of phase variation, the amount of selected polysaccharide, e.g., PSA, made by each cell is increased compared to wild type. In addition, because a selected mutant strain expresses only a single capsular polysaccharide, or it expresses only a desired combination of capsular polysaccharides, purification of capsular polysaccharide is greatly simplified and more efficient.

The invention also relates in part to a second novel gene, inv19, located adjacent to res02 and believed by the applicants also to be involved in controlling the expression of capsular polysaccharides in B. fragilis. Unlike res02, inactivation or deletion of inv19 does not appear to result in locking the phase variable promoters of capsular polysaccharide biosynthesis loci. However, it was discovered according to the instant invention that deletion of both res02 and inv19 can produce a genotype different from res02 deletion alone. For example, in one instance a res02 deletion alone did not express PSC, while a double res02/inv19 deletion did express PSC.

As detailed below, the present invention thus includes methods for producing and selecting novel mutant strains of B. fragilis that constitutively express only particular capsular polysaccharides; compositions directed to the novel mutant strains of B. fragilis that constitutively express only particular capsular polysaccharides; improved methods for purification of individual capsular polysaccharides; and novel compositions directed to the res02 and inv19 genes and their gene products.

It is believed that the novel res02 polypeptide encoded by the novel res02 gene functions as an invertase for nucleic acid sequences having specific structural characteristics. The specific structural characteristics include the presence of inverted repeat sequences flanking a central sequence, where the inverted repeat sequences have a particular sequence specifically recognized by the res02 polypeptide.
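By way of illustration only, the structural arrangement described above (inverted repeats flanking a central sequence) and the effect of inverting the central sequence can be sketched in Python. The repeat sequence, the toy DNA string, and the function names below are hypothetical and are not taken from the sequences disclosed herein; the sketch models only the string-level outcome of an inversion, not the activity of the res02 polypeptide.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    complement = str.maketrans("acgtACGT", "tgcaTGCA")
    return seq.translate(complement)[::-1]


def invert_segment(dna: str, left_repeat: str) -> str:
    """Invert the central sequence flanked by left_repeat and its reverse complement.

    The dna string is returned unchanged if the flanking pair is not found.
    """
    right_repeat = reverse_complement(left_repeat)
    start = dna.find(left_repeat)
    if start == -1:
        return dna
    inner_start = start + len(left_repeat)
    end = dna.find(right_repeat, inner_start)
    if end == -1:
        return dna
    inner = dna[inner_start:end]
    # Inversion replaces the central sequence with its reverse complement;
    # the flanking repeats themselves are left in place.
    return dna[:inner_start] + reverse_complement(inner) + dna[end:]


# Hypothetical toy sequences (not from SEQ ID NO:1 or any disclosed locus).
LEFT_REPEAT = "acgtta"
toy_dna = "cc" + LEFT_REPEAT + "ggggtataatcc" + reverse_complement(LEFT_REPEAT) + "aa"
inverted = invert_segment(toy_dna, LEFT_REPEAT)
restored = invert_segment(inverted, LEFT_REPEAT)
```

In this sketch a second inversion restores the original orientation, which corresponds to the reversible switching between orientations that phase variation implies.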

In one aspect the invention provides an isolated nucleic acid molecule related to res02. The nucleic acid according to this aspect includes a nucleotide sequence selected from the group consisting of: (a) a nucleotide sequence as set forth in SEQ ID NO:1; (b) a nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:2; (c) a nucleotide sequence which hybridizes under stringent conditions to a complement of (a) or (b); and (d) a nucleotide sequence complementary to any of (a)-(c). In one embodiment the nucleic acid molecule is a nucleotide sequence as set forth in SEQ ID NO:1. In one embodiment the nucleic acid molecule is a nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:2. A nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:2 includes a nucleotide sequence that differs from the nucleotide sequence as set forth in SEQ ID NO:1 in codon sequence due to degeneracy of the genetic code.
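The degeneracy of the genetic code referred to above can be illustrated with a minimal Python sketch, assuming the standard codon assignments. The two toy nucleotide sequences and the function name are hypothetical and are unrelated to SEQ ID NO:1; they merely show that nucleotide sequences differing in codon sequence can encode the same polypeptide.

```python
from itertools import product

# Standard codon assignments, listed in t, c, a, g order for each codon position.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA sequence in frame from its first nucleotide up to the first stop codon."""
    dna = dna.lower()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy sequences that differ at the nucleotide level.
sequence_one = "atgaaactgtaa"
sequence_two = "atgaagttatga"
same_polypeptide = translate(sequence_one) == translate(sequence_two)
```

Both toy sequences translate to the same three-residue peptide although three of their four codons differ.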

In another aspect the invention provides an isolated nucleic acid molecule related to res02. The nucleic acid according to this aspect includes a nucleotide sequence selected from the group consisting of: (a) a nucleotide sequence encoding a polypeptide which is at least about 70 percent identical to a polypeptide as set forth in SEQ ID NO:2, wherein the encoded polypeptide has an activity of the polypeptide set forth in SEQ ID NO:2; (b) a nucleotide sequence encoding an allelic variant of a nucleotide sequence as set forth in SEQ ID NO:1 or (a); (c) a region of the nucleotide sequence of SEQ ID NO:1, (a), or (b) encoding a polypeptide fragment of at least about 9 amino acid residues, wherein the polypeptide fragment has an activity of the encoded polypeptide as set forth in SEQ ID NO:2, or is antigenic; (d) a region of the nucleotide sequence of SEQ ID NO:1, or any of (a)-(c) comprising a fragment of at least about 16 nucleotides; (e) a nucleotide sequence which hybridizes under moderately or highly stringent conditions to the complement of any of (a)-(d); and (f) a nucleotide sequence complementary to any of (a)-(d).

In yet another aspect the invention provides an isolated nucleic acid molecule related to res02. The nucleic acid according to this aspect includes a nucleotide sequence selected from the group consisting of: (a) a nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:2 with at least one conservative amino acid substitution, wherein the encoded polypeptide has an activity of the polypeptide set forth in SEQ ID NO:2; (b) a nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:2 with at least one amino acid insertion, wherein the encoded polypeptide has an activity of the polypeptide set forth in SEQ ID NO:2; (c) a nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:2 with at least one amino acid deletion, wherein the encoded polypeptide has an activity of the polypeptide set forth in SEQ ID NO:2; (d) a nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:2 which has a C- and/or N-terminal truncation, wherein the encoded polypeptide has an activity of the polypeptide set forth in SEQ ID NO:2; (e) a nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:2 with at least one modification selected from the group consisting of amino acid substitutions, amino acid insertions, amino acid deletions, C-terminal truncation, and N-terminal truncation, wherein the encoded polypeptide has an activity of the polypeptide set forth in SEQ ID NO:2; (f) a nucleotide sequence of any of (a)-(e) comprising a fragment of at least about 16 nucleotides; (g) a nucleotide sequence which hybridizes under moderately or highly stringent conditions to the complement of any of (a)-(f); and (h) a nucleotide sequence complementary to any of (a)-(e).

In another aspect the invention provides a vector including the res02 nucleic acid molecule of any of the aspects above.

In a further aspect the invention provides a host cell including any of the res02 vectors above.

In another aspect the invention provides an isolated polypeptide related to res02. The isolated polypeptide according to this aspect includes an amino acid sequence as set forth in SEQ ID NO:2.

In another aspect the invention provides an isolated polypeptide related to res02. The isolated polypeptide according to this aspect includes an amino acid sequence selected from the group consisting of: (a) an amino acid sequence for an ortholog of SEQ ID NO:2; (b) an amino acid sequence which is at least about 70 percent identical to the amino acid sequence of SEQ ID NO:2, wherein the polypeptide has an activity of the polypeptide set forth in SEQ ID NO:2; (c) a fragment of the amino acid sequence set forth in SEQ ID NO:2 comprising at least about 9 amino acid residues, wherein the fragment has an activity of the polypeptide set forth in SEQ ID NO:2, or is antigenic; and (d) an amino acid sequence for an allelic variant of the amino acid sequence as set forth in SEQ ID NO:2, (a), or (b).

In yet another aspect the invention provides an isolated polypeptide related to res02. The isolated polypeptide according to this aspect includes an amino acid sequence selected from the group consisting of: (a) the amino acid sequence as set forth in SEQ ID NO:2 with at least one conservative amino acid substitution, wherein the polypeptide has an activity of the polypeptide set forth in SEQ ID NO:2; (b) the amino acid sequence as set forth in SEQ ID NO:2 with at least one amino acid insertion, wherein the polypeptide has an activity of the polypeptide set forth in SEQ ID NO:2; (c) the amino acid sequence as set forth in SEQ ID NO:2 with at least one amino acid deletion, wherein the polypeptide has an activity of the polypeptide set forth in SEQ ID NO:2; (d) the amino acid sequence as set forth in SEQ ID NO:2 which has a C- and/or N-terminal truncation, wherein the polypeptide has an activity of the polypeptide set forth in SEQ ID NO:2; and (e) the amino acid sequence as set forth in SEQ ID NO:2 with at least one modification selected from the group consisting of amino acid substitutions, amino acid insertions, amino acid deletions, C-terminal truncation, and N-terminal truncation, wherein the polypeptide has an activity of the polypeptide set forth in SEQ ID NO:2.

In certain embodiments the isolated polypeptide encoded by any of the foregoing res02 nucleic acid molecules has an activity of the polypeptide set forth in SEQ ID NO:2. Specifically, in certain embodiments the activity is promoter invertase activity.

The invention in another aspect provides a selective binding agent or fragment thereof which specifically binds the res02 polypeptide of any of the foregoing aspects. In one embodiment the selective binding agent or fragment thereof specifically binds the polypeptide comprising the amino acid sequence as set forth in SEQ ID NO:2 or a fragment thereof. In one embodiment the selective binding agent is an antibody or fragment thereof.

In yet another aspect the invention provides a fusion polypeptide comprising the res02 polypeptide of any of the foregoing aspects, fused to a heterologous amino acid sequence.

In a further aspect the invention provides an isolated nucleic acid molecule related to inv19. The nucleic acid according to this aspect includes a nucleotide sequence selected from the group consisting of: (a) a nucleotide sequence as set forth in SEQ ID NO:3; (b) a nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:4; (c) a nucleotide sequence which hybridizes under stringent conditions to a complement of (a) or (b); and (d) a nucleotide sequence complementary to any of (a)-(c). In one embodiment the nucleic acid molecule is a nucleotide sequence as set forth in SEQ ID NO:3. In one embodiment the nucleic acid molecule is a nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:4. A nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:4 includes a nucleotide sequence that differs from the nucleotide sequence as set forth in SEQ ID NO:3 in codon sequence due to degeneracy of the genetic code.

In another aspect the invention provides an isolated nucleic acid molecule related to inv19. The nucleic acid according to this aspect includes a nucleotide sequence selected from the group consisting of: (a) a nucleotide sequence encoding a polypeptide which is at least about 70 percent identical to a polypeptide as set forth in SEQ ID NO:4, wherein the encoded polypeptide has an activity of the polypeptide set forth in SEQ ID NO:4; (b) a nucleotide sequence encoding an allelic variant of a nucleotide sequence as set forth in SEQ ID NO:3 or (a); (c) a region of the nucleotide sequence of SEQ ID NO:3, (a), or (b) encoding a polypeptide fragment of at least about 9 amino acid residues, wherein the polypeptide fragment has an activity of the encoded polypeptide as set forth in SEQ ID NO:4, or is antigenic; (d) a region of the nucleotide sequence of SEQ ID NO:3, or any of (a)-(c) comprising a fragment of at least about 16 nucleotides; (e) a nucleotide sequence which hybridizes under moderately or highly stringent conditions to the complement of any of (a)-(d); and (f) a nucleotide sequence complementary to any of (a)-(d).

In yet another aspect the invention provides an isolated nucleic acid molecule related to inv19. The nucleic acid according to this aspect includes a nucleotide sequence selected from the group consisting of: (a) a nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:4 with at least one conservative amino acid substitution, wherein the encoded polypeptide has an activity of the polypeptide set forth in SEQ ID NO:4; (b) a nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:4 with at least one amino acid insertion, wherein the encoded polypeptide has an activity of the polypeptide set forth in SEQ ID NO:4; (c) a nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:4 with at least one amino acid deletion, wherein the encoded polypeptide has an activity of the polypeptide set forth in SEQ ID NO:4; (d) a nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:4 which has a C- and/or N-terminal truncation, wherein the encoded polypeptide has an activity of the polypeptide set forth in SEQ ID NO:4; (e) a nucleotide sequence encoding a polypeptide as set forth in SEQ ID NO:4 with at least one modification selected from the group consisting of amino acid substitutions, amino acid insertions, amino acid deletions, C-terminal truncation, and N-terminal truncation, wherein the encoded polypeptide has an activity of the polypeptide set forth in SEQ ID NO:4; (f) a nucleotide sequence of any of (a)-(e) comprising a fragment of at least about 16 nucleotides; (g) a nucleotide sequence which hybridizes under moderately or highly stringent conditions to the complement of any of (a)-(f); and (h) a nucleotide sequence complementary to any of (a)-(e).

In another aspect the invention provides a vector including the inv19 nucleic acid molecule of any of the aspects above.

In a further aspect the invention provides a host cell including any of the inv19 vectors above.

In another aspect the invention provides an isolated polypeptide related to inv19. The isolated polypeptide according to this aspect includes an amino acid sequence as set forth in SEQ ID NO:4.

In another aspect the invention provides an isolated polypeptide related to inv19. The isolated polypeptide according to this aspect includes an amino acid sequence selected from the group consisting of: (a) an amino acid sequence for an ortholog of SEQ ID NO:4; (b) an amino acid sequence which is at least about 70 percent identical to the amino acid sequence of SEQ ID NO:4, wherein the polypeptide has an activity of the polypeptide set forth in SEQ ID NO:4; (c) a fragment of the amino acid sequence set forth in SEQ ID NO:4 comprising at least about 9 amino acid residues, wherein the fragment has an activity of the polypeptide set forth in SEQ ID NO:4, or is antigenic; and (d) an amino acid sequence for an allelic variant of the amino acid sequence as set forth in SEQ ID NO:4, (a), or (b).

In yet another aspect the invention provides an isolated polypeptide related to inv19. The isolated polypeptide according to this aspect includes an amino acid sequence selected from the group consisting of: (a) the amino acid sequence as set forth in SEQ ID NO:4 with at least one conservative amino acid substitution, wherein the polypeptide has an activity of the polypeptide set forth in SEQ ID NO:4; (b) the amino acid sequence as set forth in SEQ ID NO:4 with at least one amino acid insertion, wherein the polypeptide has an activity of the polypeptide set forth in SEQ ID NO:4; (c) the amino acid sequence as set forth in SEQ ID NO:4 with at least one amino acid deletion, wherein the polypeptide has an activity of the polypeptide set forth in SEQ ID NO:4; (d) the amino acid sequence as set forth in SEQ ID NO:4 which has a C- and/or N-terminal truncation, wherein the polypeptide has an activity of the polypeptide set forth in SEQ ID NO:4; and (e) the amino acid sequence as set forth in SEQ ID NO:4 with at least one modification selected from the group consisting of amino acid substitutions, amino acid insertions, amino acid deletions, C-terminal truncation, and N-terminal truncation, wherein the polypeptide has an activity of the polypeptide set forth in SEQ ID NO:4.

In certain embodiments the isolated polypeptide encoded by any of the foregoing inv19 nucleic acid molecules has an activity of the polypeptide set forth in SEQ ID NO:4.

The invention in another aspect provides a selective binding agent or fragment thereof which specifically binds the inv19 polypeptide of any of the foregoing aspects. In one embodiment the selective binding agent or fragment thereof specifically binds the polypeptide comprising the amino acid sequence as set forth in SEQ ID NO:4 or a fragment thereof. In one embodiment the selective binding agent is an antibody or fragment thereof.

In yet another aspect the invention provides a fusion polypeptide comprising the inv19 polypeptide of any of the foregoing aspects, fused to a heterologous amino acid sequence.

According to yet another aspect of the invention, a bacterial cell is provided in which res02 expression is disabled. The expression of res02 can be disabled by any of a number of possible mechanisms, including, but not limited to, alteration of the res02 gene sequence by an insertion or deletion mutation that causes either deletion, truncation, or frameshift of the translated product, or disruption of a res02 gene expression sequence, such that res02 is not expressed by the cell.

According to yet another aspect of the invention, a bacterial cell is provided in which inv19 expression is disabled. The expression of inv19 can be disabled by any of a number of possible mechanisms, including, but not limited to, alteration of the inv19 gene sequence by an insertion or deletion mutation that causes either deletion, truncation, or frameshift of the translated product, or disruption of an inv19 gene expression sequence, such that inv19 is not expressed by the cell.

In another aspect the invention provides a population of bacterial cells stably expressing a specific capsular polysaccharide, or only a limited set of capsular polysaccharides including the specific capsular polysaccharide, selected from the group consisting of: PSA, PSB, PSD, PSE, PSF, PSG, and PSH. In one embodiment the specific capsular polysaccharide is PSA. In certain embodiments the bacterial cells are B. fragilis, including B. fragilis NCTC9343. In one embodiment the bacterial cells are B. fragilis 9343res02mut44. In another embodiment the bacterial cells are B. fragilis 9343res02mut2.

The invention further provides a bacterial cell expressing a capsular polysaccharide selected from the group consisting of: PSA, PSB, PSD, PSE, PSF, PSG, and PSH, wherein a promoter controlling expression of the capsular polysaccharide is locked on. In one preferred embodiment the capsular polysaccharide is PSA. In some embodiments promoters controlling expression of each and every capsular polysaccharide selected from the group consisting of: PSB, PSD, PSE, PSF, PSG, and PSH are locked off. In some embodiments any one or combination of capsular polysaccharides selected from the group consisting of: PSA, PSB, PSD, PSE, PSF, PSG, and PSH is not expressed. Also in some embodiments a promoter controlling expression of any one or combination of capsular polysaccharides selected from the group consisting of: PSA, PSB, PSD, PSE, PSF, PSG, and PSH, wherein said any one or combination of capsular polysaccharides is not expressed, is locked off. In one embodiment res02 expression is disabled. In one embodiment inv19 expression is disabled. In one embodiment res02 expression and inv19 expression are both disabled. In one embodiment the bacterial cell according to any of the foregoing aspects is B. fragilis. In one embodiment the bacterial cell according to any of the foregoing aspects is B. fragilis NCTC9343. In one embodiment the bacterial cell is B. fragilis 9343res02mut44. In another embodiment the bacterial cell is B. fragilis 9343res02mut2.

According to yet another aspect, the invention provides a method for locking a phase-variable promoter of a capsular polysaccharide biosynthesis gene on. The method according to this aspect involves inactivating res02 in a bacterial cell to lock a phase-variable promoter of a biosynthesis gene for a capsular polysaccharide on.

In a further aspect the invention provides a method for affecting expression of a capsular polysaccharide biosynthesis gene. The method according to this aspect involves inactivating inv19 in a bacterial cell to affect expression of a biosynthesis gene for a capsular polysaccharide. In certain embodiments the method further involves selecting a bacterial cell expressing at least one capsular polysaccharide selected from the group consisting of: PSA, PSB, PSD, PSE, PSF, PSG, and PSH. In certain embodiments the method further involves selecting a bacterial cell expressing only one capsular polysaccharide selected from the group consisting of: PSA, PSB, PSD, PSE, PSF, PSG, and PSH. In one embodiment the one capsular polysaccharide is PSA.

The invention in another aspect provides an improved method for purifying PSA from a bacterial cell. The improvement involves inactivating res02 in the bacterial cell such that a promoter for PSA is locked on or off, and selecting a bacterial cell expressing PSA. In one embodiment the bacterial cell is B. fragilis. In one embodiment the cell is B. fragilis NCTC9343. In one particular embodiment the bacterial cell is B. fragilis 9343res02mut44. In another particular embodiment the bacterial cell is B. fragilis 9343res02mut2.

The invention according to yet another aspect further provides a method for producing a pure capsular polysaccharide. The method according to this aspect involves growing a population of bacterial cells stably expressing a specific capsular polysaccharide, or only a restricted set of capsular polysaccharides including the specific capsular polysaccharide, selected from the group consisting of: PSA, PSB, PSD, PSE, PSF, PSG, and PSH, and isolating the specific capsular polysaccharide from the population of bacterial cells to produce a pure capsular polysaccharide. In one embodiment a promoter controlling expression of any one or combination of capsular polysaccharides selected from the group consisting of: PSA, PSB, PSD, PSE, PSF, PSG, and PSH, excluding the specific capsular polysaccharide, is locked off. In one embodiment the specific expressed capsular polysaccharide is PSA. In one embodiment promoters controlling expression of each and every capsular polysaccharide selected from the group consisting of: PSB, PSD, PSE, PSF, PSG, and PSH, are locked off. In one embodiment the bacterial cells are B. fragilis. In one embodiment the bacterial cells are B. fragilis NCTC9343. In a particular embodiment the bacterial cells are B. fragilis 9343res02mut44. In another particular embodiment the bacterial cells are B. fragilis 9343res02mut2.

In yet another aspect the invention provides a method of treating or preventing inflammatory bowel disease. The method according to this aspect involves administering to a subject in need of treatment for or prevention of inflammatory bowel disease an effective amount of a population of bacterial cells stably expressing a specific capsular polysaccharide to treat or prevent the inflammatory bowel disease. In one embodiment the specific capsular polysaccharide is PSA. In one embodiment the bacterial cells are B. fragilis. In one embodiment the bacterial cells are B. fragilis NCTC9343. In one embodiment the bacterial cells are B. fragilis 9343res02mut44 (mut44). In another embodiment the bacterial cells are B. fragilis 9343res02mut2 (mut2).

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is the nucleotide sequence of res02 (SEQ ID NO:1).

FIG. 2 is the deduced amino acid sequence of res02 (SEQ ID NO:2).

FIG. 3 is the nucleotide sequence of inv19 (SEQ ID NO:3).

FIG. 4 is the deduced amino acid sequence of inv19 (SEQ ID NO:4).

FIG. 5 is a diagram showing the construction of pKGW10 and mutation strategy used in deleting res02 from the B. fragilis NCTC9343 chromosome. The direction of translation of res02 is from right to left in the figure.

FIG. 6 is the nucleotide sequence of Left Flank, inv19-D1→ through ←inv19-D2, 1,958 bp (SEQ ID NO:5).

FIG. 7 is the nucleotide sequence of Right Flank, inv19-D5→ through ←inv19-D6, 2,540 bp (SEQ ID NO:6).

FIG. 8 is a composite of Western immunoblot images demonstrating that mut2 (lane 2) and mut44 (lane 4) express PSA, but not PSB, PSC, PSD, PSE, PSG, or PSH. Small amounts of PSF were expressed by each of these mutant strains, but not by another res02 mutant, res02mut8. Wild-type (lane 1) and res02mut8 (lane 3) are also shown for comparison.

DETAILED DESCRIPTION

The technology described in this application has utilized the genetic information that the inventors have discovered about B. fragilis NCTC9343 to circumvent both the difficulty of purification of one capsular polysaccharide from the other seven polysaccharides, and the low yield of a given capsular polysaccharide due to the phase variation in its expression. Importantly, the technology described in this application circumvents both the difficulty of purification of PSA from the other seven polysaccharides, and the low yield of PSA due to the phase variation in its expression.

In featured aspects of the invention, res02 is inactivated so that phase variation of the capsular polysaccharides is eliminated. The lack of res02 activity renders the invertible promoters for the polysaccharide biosynthesis loci locked in either the on or the off orientation. Selection of particular phenotypes then permits selection of mutants with the promoter of a selected capsular polysaccharide locked on and the promoters of other capsular polysaccharides locked off.

The terms “res02”, “res02 gene”, and “res02 nucleic acid molecule” refer to a nucleic acid molecule comprising or consisting of a nucleotide sequence as set forth in SEQ ID NO:1, a nucleotide sequence encoding the polypeptide as set forth in SEQ ID NO:2, and nucleic acid molecules as defined herein.

The terms “inv19”, “inv19 gene”, and “inv19 nucleic acid molecule” refer to a nucleic acid molecule comprising or consisting of a nucleotide sequence as set forth in SEQ ID NO:3, a nucleotide sequence encoding the polypeptide as set forth in SEQ ID NO:4, and nucleic acid molecules as defined herein.

The terms “res02”, “res02 polypeptide”, and “res02 gene product” refer to a polypeptide comprising the amino acid sequence of SEQ ID NO:2 and related polypeptides. Related polypeptides include res02 polypeptide fragments, res02 polypeptide orthologs, res02 polypeptide variants, and res02 polypeptide derivatives, which possess at least one activity of the polypeptide as set forth in SEQ ID NO:2.

The terms “inv19”, “inv19 polypeptide”, and “inv19 gene product” refer to a polypeptide comprising the amino acid sequence of SEQ ID NO:4 and related polypeptides. Related polypeptides include inv19 polypeptide fragments, inv19 polypeptide orthologs, inv19 polypeptide variants, and inv19 polypeptide derivatives, which possess at least one activity of the polypeptide as set forth in SEQ ID NO:4.

Methods and Compositions for Overexpressing Capsular Polysaccharides

A bacterial cell in which “res02 expression is disabled” is a viable bacterial cell in which the level of functional res02 polypeptide is negligible compared to the level of functional res02 polypeptide normally expressed by the bacterial cell under the same conditions. The expression can be disabled by the introduction or presence of a mutation in the res02 gene that results in a nonfunctional res02 gene product or no functional res02 gene product. Such mutations include those involving at least one missense mutation, nonsense mutation, truncation mutation, insertion mutation, or deletion mutation, or any combination thereof, involving the res02 open reading frame (ORF). In some embodiments mutations resulting in disabled res02 expression include at least one truncation, insertion, or deletion mutation, or any combination thereof, involving the res02 promoter. In some embodiments res02 expression is disabled through manipulation of factors upstream of the expression of res02. For example, transcriptional activators, signal transduction, and global regulators can influence expression of res02. See, for example, Gally D L et al. (1993) J Bacteriol 175:6186-93; Blomfield I C et al. (1993) J Bacteriol 175:27-36; Dorman C J et al. (1987) J Bacteriol 169:3840-43; Eisenstein B I et al. (1987) Proc Natl Acad Sci USA 84:6506-10. In one embodiment, a bacterial cell in which res02 expression is disabled has a chromosomal deletion of the res02 gene.

For example, a bacterial cell in which res02 expression is disabled can be a cell in which there is a deletion of any one or combination of the first, second, third, fourth, fifth, sixth, and so on, nucleotides shown in SEQ ID NO:1. In one embodiment the deletion occurs nearer to the 5′ end than to the 3′ end. For example, deletion of the first, second or third nucleotide shown in SEQ ID NO:1 obliterates the “atg” start codon at position 1 and moves the first “atg” start codon to position 67 of SEQ ID NO:1, generating an ORF that encodes amino acids 23-197 shown in SEQ ID NO:2. As another example, deletion of the fourth nucleotide shown in SEQ ID NO:1 maintains the original “atg” start codon at position 1 but introduces a number of adjacent and nearby stop codons.
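The effect of such single-nucleotide deletions on an ORF can be illustrated with a minimal sketch. The sequence below is a hypothetical toy sequence, not SEQ ID NO:1, and the helper names `first_orf` and `orf_after_deletion` are assumptions introduced only for illustration. The toy sequence is built so that deleting nucleotide 1, 2, or 3 moves the first “atg” to a downstream in-frame position, whereas deleting nucleotide 4 keeps the original “atg” but creates an immediate stop codon.

```python
STOP_CODONS = {"taa", "tag", "tga"}

# Hypothetical toy ORF (NOT SEQ ID NO:1): atg ctg acc gat atg ccg ttc gaa taa
TOY_ORF = "atgctgaccgatatgccgttcgaataa"

def first_orf(seq):
    """Return (1-based start, codon count) for the ORF at the first 'atg'.

    The codon count includes the start codon and excludes the stop codon.
    """
    start = seq.find("atg")
    if start == -1:
        return None
    codons = 0
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        codons += 1
    return start + 1, codons

def orf_after_deletion(seq, k):
    """Delete the k-th nucleotide (1-based) and describe the resulting ORF.

    The start position is reported in the coordinates of the original sequence.
    """
    mutant = seq[:k - 1] + seq[k:]
    result = first_orf(mutant)
    if result is None:
        return None
    start, codons = result
    if start >= k:          # shift back to original numbering
        start += 1
    return start, codons

print(first_orf(TOY_ORF))              # wild-type toy ORF
for k in (1, 2, 3, 4):
    print(k, orf_after_deletion(TOY_ORF, k))
```

In this toy case the wild-type ORF has eight codons, deletion of nucleotide 1, 2, or 3 leaves a four-codon ORF starting at original position 13, and deletion of nucleotide 4 leaves a single codon before the first stop.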

As a further example, a bacterial cell in which res02 expression is disabled can be a cell in which there is an insertion of one or more nucleotides following any one or combination of the first, second, third, fourth, fifth, sixth, and so on, nucleotides shown in SEQ ID NO:1. For example, the insertion can be an insertion of an in-frame stop codon (taa, tag, tga) in SEQ ID NO:1. As a further example, the insertion can be a polynucleotide, e.g., a polynucleotide sequence that encodes an exogenous gene product. In one embodiment the insertion occurs nearer to the 5′ end than to the 3′ end.

The deletion or insertion can be accomplished by homologous recombination.

Homologous recombination is a technique originally developed for targeting genes to induce or correct mutations in transcriptionally active genes. Kucherlapati R S (1989) Prog Nucleic Acid Res Mol Biol 36:301-10. The basic technique was developed as a method for introducing specific mutations into specific regions of the mammalian genome (Thomas K R et al. (1986) Cell 44:419-28; Thomas K R et al. (1987) Cell 51:503-12; Doetschman T et al. (1988) Proc Natl Acad Sci USA 85:8583-87) or to correct specific mutations within defective genes (Doetschman T et al. (1987) Nature 330:576-78). Exemplary homologous recombination techniques are described in U.S. Pat. No. 5,272,071; European Patent Nos. 9193051 and 505500; and PCT/US90/07642 (PCT Pub No. WO 91/09955).

Through homologous recombination, the DNA sequence to be inserted into the genome can be directed to a specific region of the gene of interest by attaching it to targeting DNA. The targeting DNA is a nucleotide sequence that is complementary (homologous) to a region of the genomic DNA. Small pieces of targeting DNA that are complementary to a specific region of the genome are put in contact with the parental strand during the DNA replication process. It is a general property of DNA that has been inserted into a cell to hybridize, and therefore, recombine with other pieces of endogenous DNA through shared homologous regions. If this complementary strand is attached to an oligonucleotide that contains a mutation or a different sequence or an additional nucleotide, it too is incorporated into the newly synthesized strand as a result of the recombination. As a result of the proofreading function, it is possible for the new sequence of DNA to serve as the template. Thus, the transferred DNA is incorporated into the genome.

Homologous recombination can also be used to delete a region of genomic DNA. In this instance the targeting DNA is a nucleotide sequence that is complementary (homologous) to regions of the genomic DNA flanking that which is to be deleted. The targeting DNA is usually at least one kilobase (1 kb) long for each flanking region, and more preferably it is closer to 2 kb for each flanking region. If this complementary strand is put in contact with the parental strand during the DNA replication process, it is incorporated into the newly synthesized strand as a result of recombination. As a result of the proofreading function, it is possible for the new sequence of DNA, lacking the original genomic sequence between the homologous flanking regions, to serve as the template. Thus, the transferred DNA is incorporated into the genome.

Deletion by homologous recombination involves double crossover allelic exchange. In this method a first recombination event occurs with one flank region, and then a second recombination event occurs with the opposite flank. As a result, double crossover homologous exchange results in two groups of cells, those which are deletion mutants and those which are wild type. The determination of whether a particular cell is wild type or a deletion mutant can be accomplished by screening for the wild type gene or wild type gene product, or direct screening for the chromosomal deletion.
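The outcome of double crossover allelic exchange, and direct screening for the chromosomal deletion, can be modeled on plain strings as a rough sketch. All sequences and the function names `deletion_allele` and `flank_to_flank_length` are hypothetical and introduced only for illustration. Real flanking regions would be on the order of 1 to 2 kb each, as noted above, and the length comparison merely stands in for an assay such as amplification across the flanks.

```python
def deletion_allele(genome, left_flank, right_flank):
    """Return the genome with everything between the two flanks removed."""
    i = genome.find(left_flank)
    if i == -1:
        raise ValueError("left flank not found")
    j = genome.find(right_flank, i + len(left_flank))
    if j == -1:
        raise ValueError("right flank not found")
    return genome[:i + len(left_flank)] + genome[j:]

def flank_to_flank_length(genome, left_flank, right_flank):
    """Length from the start of the left flank to the end of the right flank."""
    i = genome.find(left_flank)
    j = genome.find(right_flank, i + len(left_flank))
    if i == -1 or j == -1:
        raise ValueError("flank not found")
    return j + len(right_flank) - i

# Hypothetical toy sequences (real flanks would be about 1-2 kb each).
LEFT, TARGET, RIGHT = "gattaca", "atgcccgggtaa", "ttgacct"
wild_type = "cc" + LEFT + TARGET + RIGHT + "gg"
mutant = deletion_allele(wild_type, LEFT, RIGHT)

wt_len = flank_to_flank_length(wild_type, LEFT, RIGHT)
mut_len = flank_to_flank_length(mutant, LEFT, RIGHT)
print(wt_len, mut_len)  # the deletion mutant yields the shorter product
```

A cell whose flank-to-flank length matches the shorter value would be scored as a deletion mutant, and one matching the longer value as wild type.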

A bacterial cell in which “inv19 expression is disabled” is a viable bacterial cell in which the level of functional inv19 polypeptide is negligible compared to the level of functional inv19 polypeptide normally expressed by the bacterial cell under the same conditions. The expression can be disabled by the introduction or presence of a mutation in the inv19 gene that results in a nonfunctional inv19 gene product or no functional inv19 gene product. Such mutations include those involving at least one missense mutation, nonsense mutation, truncation mutation, insertion mutation, or deletion mutation, or any combination thereof, involving the inv19 open reading frame (ORF). In some embodiments mutations resulting in disabled inv19 expression include at least one truncation, insertion, or deletion mutation, or any combination thereof, involving the inv19 promoter. In some embodiments inv19 expression is disabled through manipulation of factors upstream of the expression of inv19. For example, transcriptional activators, signal transduction, and global regulators can influence expression of inv19. See, for example, Gally D L et al. (1993) J Bacteriol 175:6186-93; Blomfield I C et al. (1993) J Bacteriol 175:27-36; Dorman C J et al. (1987) J Bacteriol 169:3840-43; Eisenstein B I et al. (1987) Proc Natl Acad Sci USA 84:6506-10. In one embodiment, a bacterial cell in which inv19 expression is disabled has a chromosomal deletion of the inv19 gene.

For example, a bacterial cell in which inv19 expression is disabled can be a cell in which there is a deletion of any one or combination of the first, second, third, fourth, fifth, sixth, and so on, nucleotides shown in SEQ ID NO:3. In one embodiment the deletion occurs nearer to the 5′ end than to the 3′ end.

As a further example, a bacterial cell in which inv19 expression is disabled can be a cell in which there is an insertion of one or more nucleotides following any one or combination of the first, second, third, fourth, fifth, sixth, and so on, nucleotides shown in SEQ ID NO:3. For example, the insertion can be an insertion of an in-frame stop codon (taa, tag, tga) in SEQ ID NO:3. As a further example, the insertion can be a polynucleotide, e.g., a polynucleotide sequence that encodes an exogenous gene product. In one embodiment the insertion occurs nearer to the 5′ end than to the 3′ end.

The deletion or insertion can be accomplished by homologous recombination.

A “population of bacterial cells” is a culture of bacterial cells. The culture can be liquid culture, semi-solid culture, e.g., in gelatin, or a culture on solid medium, e.g., on nutrient-supplemented agar. Examples of these various types of bacterial culture are well-known by those of skill in the art. A liquid culture can be in a test tube, flask, roller bottle, bioreactor, or other suitable container, preferably with controlled conditions of temperature, aeration, and agitation. The volume of a liquid culture can range from less than 1 mL to 10 L or more. In one embodiment the population of bacterial cells derives from a single bacterium, i.e., it is a clone.

A “bacterial cell expressing a capsular polysaccharide” refers to the detectable phenotype of a bacterial cell with respect to the capsular polysaccharide. The phenotype can be concordant or discordant with genotype. For example, a bacterial cell with a promoter for PSE locked in the on orientation may or may not express PSE. Of course, a bacterial cell with a promoter for PSE locked in the off orientation generally will not express PSE. The genotype can conveniently be assessed by polymerase chain reaction, restriction endonuclease digestion, direct sequencing, or any other method suitable for determining or inferring the sequence of a relevant segment of chromosomal DNA or RNA. The phenotype can conveniently be assessed by Western immunoblot, immunoaffinity, or other suitable assay using antibodies or other capsular polysaccharide binding agents specific for the particular capsular polysaccharide to be assayed. The antibodies can be in the form of monospecific or defined specificity antisera, polyclonal antibodies, monoclonal antibodies, polysaccharide-specific binding fragments of the foregoing antibodies, and derivatives thereof.

A “bacterial cell stably expressing a specific capsular polysaccharide” refers to a bacterial cell expressing a particular capsular polysaccharide without phase variation of its expression. A “population of bacterial cells stably expressing a specific capsular polysaccharide” refers to a population of bacterial cells expressing a particular capsular polysaccharide without phase variation of its expression. In certain embodiments the specific capsular polysaccharide is the only capsular polysaccharide expressed by the bacterial cell or by the population of bacterial cells. The absence of phase variation can reflect the inactivation of res02 such that the normally invertible promoter for the polysaccharide biosynthesis gene is locked in its on orientation.

A “promoter controlling expression of a capsular polysaccharide” as used herein refers to a nontranscribed genetic element associated with and controlling the transcription of a capsular polysaccharide biosynthesis gene. The capsular polysaccharide biosynthesis gene can occur as part of a polycistronic capsular polysaccharide biosynthesis locus. A given promoter can regulate transcription of one or more capsular polysaccharide biosynthesis genes within the associated capsular polysaccharide biosynthesis gene locus. In one embodiment the promoter includes inverted repeat regions separated by intervening sequence. In one embodiment the promoter is contained between inverted repeat regions in the intervening sequence and is subject to inversion such that in one orientation the promoter is transcriptionally active (“on”) with respect to the capsular polysaccharide biosynthesis gene, while in the opposite or flipped orientation the promoter is transcriptionally inactive (“off”) with respect to the capsular polysaccharide biosynthesis gene. The inversion of the invertible promoter region between the inverted repeat regions is subject to control by a sequence-specific enzyme termed a recombinase or invertase. In one embodiment, the recombinase or invertase is res02 or inv19. The promoters for seven of the eight known capsular polysaccharides of B. fragilis, PSA, PSB, PSD, PSE, PSF, PSG, and PSH, have been reported to be flanked by inverted repeat regions and are subject to inversion. Krinos C M et al. (2001) Nature 414:555-58.

The downstream inverted repeat of each of the seven capsular polysaccharides in B. fragilis is shown in Table 1. Sequence shown in bold print represents a consensus res02 recognition sequence necessary for promoter inversion.

TABLE 1
Characteristics of the inverted repeat (IR) regions upstream of the seven polysaccharide (PS) biosynthesis loci.

PS Locus   Sequence of the downstream IR   Basepairs between IRs   SEQ ID NO:
PSA        acgaacgttttttgaaaca             193                     7
PSB        acgaacgttttttgaaaca             181                     7
PSD        tagacgatcgtctattgaaaca          189                     8
PSE        acgaacgttttttgaaaca             168                     7
PSF        ttaaacgaacgtctattgaaacact       187                     9
PSG        gttcaaatagacgaacgttt            174                     10
PSH        acgaacgttttttgaaaca             192                     7
PSC        None
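The geometry of an invertible promoter region flanked by inverted repeats can be sketched as follows. The downstream IR is the PSA sequence from Table 1. The upstream IR is assumed, for illustration only, to be its exact reverse complement, and the intervening segment is a hypothetical toy sequence far shorter than the 168 to 193 basepairs listed in Table 1. The function names are likewise assumptions. Inversion is modeled as replacing the intervening segment with its reverse complement while the repeats stay in place.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")

def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def invert_between(locus, upstream_ir, downstream_ir):
    """Flip the segment lying between the two inverted repeats."""
    i = locus.find(upstream_ir)
    if i == -1:
        raise ValueError("upstream IR not found")
    seg_start = i + len(upstream_ir)
    j = locus.find(downstream_ir, seg_start)
    if j == -1:
        raise ValueError("downstream IR not found")
    return locus[:seg_start] + reverse_complement(locus[seg_start:j]) + locus[j:]

DOWNSTREAM_IR = "acgaacgttttttgaaaca"               # PSA, Table 1
UPSTREAM_IR = reverse_complement(DOWNSTREAM_IR)     # assumption: perfect repeat
SEGMENT = "ttgacagctataat"                          # hypothetical toy segment

locus_on = UPSTREAM_IR + SEGMENT + DOWNSTREAM_IR
locus_off = invert_between(locus_on, UPSTREAM_IR, DOWNSTREAM_IR)

print(UPSTREAM_IR)
print(locus_off != locus_on)
print(invert_between(locus_off, UPSTREAM_IR, DOWNSTREAM_IR) == locus_on)
```

Which orientation is labeled “on” is arbitrary in this sketch. The point is only that a second inversion restores the original arrangement, so that a cell lacking the inverting enzyme would remain in whichever orientation it currently holds.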

A “phase-variable promoter of a capsular polysaccharide biosynthesis gene” refers to a promoter for a capsular polysaccharide biosynthesis gene that, as just described, includes inverted repeat regions separated by intervening sequence and is subject to inversion such that in one orientation the promoter is transcriptionally active (“on”) with respect to the capsular polysaccharide biosynthesis genes, while in the opposite orientation the promoter is transcriptionally inactive (“off”) with respect to the capsular polysaccharide biosynthesis genes. Phase variation results in the variable expression of the capsular polysaccharide in response to the orientation of the invertible promoter. This phase variation normally can occur within a given cell over time, for reasons that are not yet understood. Seven of the eight known capsular polysaccharides of B. fragilis, PSA, PSB, PSD, PSE, PSF, PSG, and PSH, are subject to phase variation. For the eight known capsular polysaccharides, the phenotype of a given cell theoretically can vary over time among any of 2⁸ (256) phenotypes. However, as disclosed herein, the phase variation among the capsular polysaccharides can be fixed or locked through disabling of the promoter inversion mechanism that underlies the phase variation.
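The count of 2⁸ (256) theoretical phenotypes follows from treating each of the eight capsular polysaccharides as independently expressed or not expressed. A minimal enumeration, offered only as an illustration of that arithmetic (the variable names are assumptions), is:

```python
from itertools import product

POLYSACCHARIDES = ["PSA", "PSB", "PSC", "PSD", "PSE", "PSF", "PSG", "PSH"]

# Each phenotype maps every polysaccharide to expressed (True) or not (False).
phenotypes = [
    dict(zip(POLYSACCHARIDES, states))
    for states in product([False, True], repeat=len(POLYSACCHARIDES))
]

print(len(phenotypes))  # 2**8 combinations
```

Locking the invertible promoters removes this combinatorial variation, since each promoter is then held in a single orientation.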

A promoter controlling expression of a capsular polysaccharide is “locked on” when the invertible promoter is in its transcriptionally active orientation and it cannot invert to the transcriptionally inactive orientation. The promoter can be locked on because a sequence-specific enzyme that normally inverts the promoter is not present or is otherwise disabled. Alternatively, the promoter can be locked on because at least one inverted repeat flanking the invertible region of the promoter is altered, e.g., deleted, so that the inversion is not possible.

A promoter controlling expression of a capsular polysaccharide is “locked off” when the invertible promoter is in its transcriptionally inactive orientation and it cannot invert to the transcriptionally active orientation. The promoter can be locked off because a sequence-specific enzyme that normally inverts the promoter is not present or is otherwise disabled. Alternatively, the promoter can be locked off because at least one inverted repeat flanking the invertible region of the promoter is altered, e.g., deleted, so that the inversion is not possible.

The phrase “inactivating res02 in a bacterial cell to lock a phase-variable promoter of a biosynthesis gene for a capsular polysaccharide” refers to any intervention that renders inversion of a phase-variable promoter of a biosynthesis gene for a capsular polysaccharide of the bacterial cell by res02 gene product impossible. The intervention can typically involve deletion of the res02 gene as described herein. Other interventions are also contemplated, including introduction of other res02 gene mutations that result in nonfunctional res02 gene products, introduction of res02 antisense nucleic acid or other agents that bind to a res02 nucleic acid molecule, as well as introduction into the bacterial cell of agents that can interfere with the function of otherwise functional res02 gene product.

The phrase “inactivating inv19 in a bacterial cell to lock a phase-variable promoter of a biosynthesis gene for a capsular polysaccharide” refers to any intervention that renders inversion of a phase-variable promoter of a biosynthesis gene for a capsular polysaccharide of the bacterial cell by inv19 gene product impossible. The intervention can typically involve deletion of the inv19 gene as described herein. Other interventions are also contemplated, including introduction of other inv19 gene mutations that result in nonfunctional inv19 gene products, introduction of inv19 antisense nucleic acid or other agents that bind to an inv19 nucleic acid molecule, as well as introduction into the bacterial cell of agents that can interfere with the function of otherwise functional inv19 gene product.

A “pure capsular polysaccharide” refers to a capsular polysaccharide of the invention that (1) has been separated from at least about 50 percent of proteins, lipids, carbohydrates, or other materials with which it is naturally found when total capsular polysaccharide is isolated from the source cells, and (2) is substantially free of all other capsular polysaccharides. “Substantially free of all other capsular polysaccharides” means the capsular polysaccharide of interest represents at least 80 percent of all capsular polysaccharides present. Preferably, the capsular polysaccharide of interest represents at least 85 percent, more preferably at least 90 percent, even more preferably at least 95 percent, and most preferably at least 98 percent, of all capsular polysaccharides present. In one embodiment, a pure capsular polysaccharide of the present invention is substantially free from any contaminants that are found in its natural environment that would interfere with its therapeutic, diagnostic, prophylactic or research use.

“Inflammatory bowel disease” refers to a group of chronic inflammatory disorders of unknown or autoimmune cause involving the gastrointestinal tract. These disorders are well known in the medical arts and include two principal categories, ulcerative colitis and Crohn's disease. Ulcerative colitis characteristically occurs as continuous lesions in the colon involving the mucosa alone, whereas Crohn's disease characteristically occurs as discontinuous lesions anywhere in the gastrointestinal tract, involving inflammation of all layers of the bowel wall. While the etiology of inflammatory bowel disease remains uncertain, there is evidence to suggest there may be an underlying infectious etiology, and effective treatments include the use of local or systemic immunomodulatory agents. See Glickman R M in: Harrison's Principles of Internal Medicine, 14th Ed., Fauci A S et al., eds., New York: McGraw-Hill, 1998, Chapter 286.

It is believed by the inventors that live or viable bacteria overexpressing PSA can be used to treat or prevent inflammatory bowel disease. The bowel is normally colonized by bacteria of many species, including B. fragilis. Therefore introduction into the bowel of live or viable bacteria overexpressing PSA is expected to be well tolerated. However, due to the previously described beneficial effects of PSA in inflammatory bowel disease, believed to be related to the ability of PSA to induce the anti-inflammatory cytokine interleukin-10 (IL-10), and the ability of these bacteria to overexpress PSA, it is expected that these live or viable bacteria can be introduced into the bowel and thus treat and prevent inflammatory bowel disease. The live or viable bacteria overexpressing PSA can be administered to subjects in need of treatment for inflammatory bowel disease by mouth or per rectum. Their effect can be determined by following disease activity in the usual manner. Furthermore, the live or viable bacteria overexpressing PSA can be administered in conjunction with any other therapeutic agent useful in the treatment of inflammatory bowel disease, except, of course, antibiotics to which the live or viable bacteria overexpressing PSA are sensitive. Viable bacteria specifically include lyophilized bacteria that are capable of growth upon their return to suitable conditions.

Res02 and Inv19 Nucleic Acids and Polypeptides

The term “polypeptide allelic variant” refers to one or several possible naturally occurring alternate forms of a gene occupying a given locus on a chromosome of an organism or a population of organisms.

The term “isolated nucleic acid molecule” refers to a nucleic acid molecule of the invention that (1) has been separated from at least about 50 percent of proteins, lipids, carbohydrates, or other materials with which it is naturally found when total nucleic acid is isolated from the source cells, (2) is not linked to all or a portion of a polynucleotide to which the “isolated nucleic acid molecule” is linked in nature, (3) is operably linked to a polynucleotide to which it is not linked in nature, or (4) does not occur in nature as part of a larger polynucleotide sequence. Preferably, the isolated nucleic acid molecule of the present invention is substantially free from any other nucleic acid molecule(s) or other contaminants that are found in its natural environment that would interfere with its use in polypeptide production or its therapeutic, diagnostic, prophylactic or research use.

The term “nucleic acid sequence” or “nucleic acid molecule” refers to a DNA or RNA sequence. The term encompasses molecules formed from any of the natural bases (adenine, cytosine, guanine, thymine, uracil) as well as base analogs of DNA and RNA such as, but not limited to, 4-acetylcytosine, 8-hydroxy-N-6-methyladenosine, aziridinyl-cytosine, pseudoisocytosine, 5-(carboxyhydroxylmethyl) uracil, 5-fluorouracil, 5-bromouracil, 5-carboxymethylaminomethyl-2-thiouracil, 5-carboxy-methylaminomethyluracil, dihydrouracil, inosine, N6-iso-pentenyladenine, 1-methyladenine, 1-methylpseudouracil, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-methyladenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyamino-methyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarbonyl-methyluracil, 5-methoxyuracil, 2-methylthio-N-6-isopentenyladenine, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid, oxybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, N-uracil-5-oxyacetic acid methylester, 2,6-diaminopurine, and any combination thereof. In some embodiments the nucleic acid sequence or nucleic acid molecule can have a modified backbone, e.g., a backbone characterized by at least one internucleoside linkage that is other than a phosphodiester bond. In one embodiment the modified backbone includes at least one stabilized internucleoside linkage, e.g., a phosphorothioate or phosphorodithioate linkage.

The term “vector” is used to refer to any molecule (e.g., nucleic acid, plasmid, or virus) used to transfer coding information to a host cell.

The term “expression vector” refers to a vector that is suitable for transformation of a host cell and contains nucleic acid sequences that direct and/or control the expression of inserted heterologous nucleic acid sequences. Expression includes, but is not limited to, processes such as transcription, translation, and RNA splicing, if introns are present.

The term “operably linked” is used herein to refer to an arrangement of gene expression sequence and coding sequence wherein transcriptional and/or translational control elements of the gene expression sequence and open reading frame of the coding sequence are covalently linked in such a way as to place the expression or transcription and/or translation of the coding sequence under the influence or control of the gene expression sequence. Thus, two DNA sequences are said to be operably linked if induction of a promoter in the 5′ gene expression sequence results in the transcription of the coding sequence and if the nature of the linkage between the two DNA sequences does not (1) result in the introduction of a frame-shift mutation, (2) interfere with the ability of the promoter region to direct the transcription of the coding sequence, or (3) interfere with the ability of the corresponding RNA transcript to be translated into a protein. Thus, a gene expression sequence would be operably linked to a res02 or inv19 nucleic acid sequence if the gene expression sequence were capable of effecting transcription of the res02 or inv19 nucleic acid sequence such that the resulting transcript is translated into the desired protein or polypeptide. A gene expression sequence need not be contiguous with the coding sequence, so long as it functions correctly. Thus, for example, intervening untranslated yet transcribed sequences can be present between a promoter sequence and the coding sequence and the promoter sequence can still be considered “operably linked” to the coding sequence. Furthermore, an enhancer can be upstream or downstream of the coding sequence, and the enhancer need not be contiguous with the coding sequence.

The term “host cell” is used to refer to a cell which has been transformed, or is capable of being transformed with a nucleic acid sequence and then of expressing a selected gene of interest. The term includes the progeny of the parent cell, whether or not the progeny is identical in morphology or in genetic make-up to the original parent, so long as the selected gene is present.

The term “res02 polypeptide fragment” refers to a polypeptide that comprises a truncation at the amino-terminus and/or a truncation at the carboxyl-terminus of the polypeptide as set forth in SEQ ID NO:2. The term “res02 polypeptide fragment” also refers to amino-terminal and/or carboxyl-terminal truncations of res02 polypeptide orthologs, res02 polypeptide derivatives, or res02 polypeptide variants, or to amino-terminal and/or carboxyl-terminal truncations of the polypeptides encoded by res02 polypeptide allelic variants or res02 polypeptide splice variants. In certain embodiments, truncations and/or deletions comprise about 10 amino acids, or about 20 amino acids, or about 50 amino acids, or about 75 amino acids, or about 100 amino acids, or more than about 100 amino acids. The polypeptide fragments so produced will comprise contiguous amino acids numbering about 187, or about 177, or about 147, or about 122, or about 97, or down to about 9, including every integer therebetween. Such res02 polypeptide fragments can optionally comprise an amino-terminal methionine residue. It will be appreciated that such fragments can be used, for example, to generate antibodies to res02 polypeptides.

The term “inv19 polypeptide fragment” refers to a polypeptide that comprises a truncation at the amino-terminus and/or a truncation at the carboxyl-terminus of the polypeptide as set forth in SEQ ID NO:4. The term “inv19 polypeptide fragment” also refers to amino-terminal and/or carboxyl-terminal truncations of inv19 polypeptide orthologs, inv19 polypeptide derivatives, or inv19 polypeptide variants, or to amino-terminal and/or carboxyl-terminal truncations of the polypeptides encoded by inv19 polypeptide allelic variants or inv19 polypeptide splice variants. In certain embodiments, truncations and/or deletions comprise about 10 amino acids, or about 20 amino acids, or about 50 amino acids, or about 75 amino acids, or about 100 amino acids, or more than about 100 amino acids. The polypeptide fragments so produced will comprise contiguous amino acids numbering about 296, or about 286, or about 256, or about 231, or about 206, or down to about 9, including every integer therebetween. Such inv19 polypeptide fragments can optionally comprise an amino-terminal methionine residue. It will be appreciated that such fragments can be used, for example, to generate antibodies to inv19 polypeptides.
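The fragment lengths recited for res02 and inv19 are consistent with subtracting each truncation size from a full-length polypeptide. In the sketch below the full lengths of 197 and 306 amino acids are inferences rather than stated values: 197 from the reference to amino acids 23-197 of SEQ ID NO:2, and 306 from the largest recited inv19 fragment (296) plus the smallest truncation (10). The names used are assumptions for illustration.

```python
# Assumed full lengths, inferred from the recited values (see lead-in).
FULL_LENGTH = {"res02": 197, "inv19": 306}
TRUNCATIONS = [10, 20, 50, 75, 100]  # amino acids removed

def fragment_lengths(polypeptide):
    """Remaining contiguous amino acids after each recited truncation."""
    full = FULL_LENGTH[polypeptide]
    return [full - cut for cut in TRUNCATIONS]

print(fragment_lengths("res02"))
print(fragment_lengths("inv19"))
```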

The term “res02 polypeptide ortholog” refers to a polypeptide from another species that corresponds to res02 polypeptide amino acid sequence as set forth in SEQ ID NO:2.

The term “inv19 polypeptide ortholog” refers to a polypeptide from another species that corresponds to inv19 polypeptide amino acid sequence as set forth in SEQ ID NO:4.

The term “res02 polypeptide variants” refers to res02 polypeptides comprising amino acid sequences having one or more amino acid sequence substitutions, deletions (such as internal deletions and/or res02 polypeptide fragments), and/or additions (such as internal additions and/or res02 fusion polypeptides) as compared to the res02 polypeptide amino acid sequence set forth in SEQ ID NO:2. Variants can be naturally occurring (e.g., res02 polypeptide allelic variants and res02 polypeptide orthologs) or artificially constructed. Such res02 polypeptide variants can be prepared from the corresponding nucleic acid molecules having a DNA sequence that varies accordingly from the DNA sequence as set forth in SEQ ID NO:1. In certain embodiments, the variants have from 1 to 3, or from 1 to 5, or from 1 to 10, or from 1 to 15, or from 1 to 20, or from 1 to 25, or from 1 to 50, or from 1 to 75, or from 1 to 100, or more than 100 amino acid substitutions, insertions, additions and/or deletions, wherein the substitutions can be conservative, or non-conservative, or any combination thereof.

The term “inv19 polypeptide variants” refers to inv19 polypeptides comprising amino acid sequences having one or more amino acid sequence substitutions, deletions (such as internal deletions and/or inv19 polypeptide fragments), and/or additions (such as internal additions and/or inv19 fusion polypeptides) as compared to the inv19 polypeptide amino acid sequence set forth in SEQ ID NO:4. Variants can be naturally occurring (e.g., inv19 polypeptide allelic variants and inv19 polypeptide orthologs) or artificially constructed. Such inv19 polypeptide variants can be prepared from the corresponding nucleic acid molecules having a DNA sequence that varies accordingly from the DNA sequence as set forth in SEQ ID NO:3. In certain embodiments, the variants have from 1 to 3, or from 1 to 5, or from 1 to 10, or from 1 to 15, or from 1 to 20, or from 1 to 25, or from 1 to 50, or from 1 to 75, or from 1 to 100, or more than 100 amino acid substitutions, insertions, additions and/or deletions, wherein the substitutions can be conservative, or non-conservative, or any combination thereof.

The term “res02 polypeptide derivatives” refers to the polypeptide as set forth in SEQ ID NO:2, res02 polypeptide fragments, res02 polypeptide orthologs, or res02 polypeptide variants, as defined herein, that have been chemically modified. The term “res02 polypeptide derivatives” also refers to the polypeptides encoded by res02 polypeptide allelic variants, as defined herein, that have been chemically modified.

The term “inv19 polypeptide derivatives” refers to the polypeptide as set forth in SEQ ID NO:4, inv19 polypeptide fragments, inv19 polypeptide orthologs, or inv19 polypeptide variants, as defined herein, that have been chemically modified. The term “inv19 polypeptide derivatives” also refers to the polypeptides encoded by inv19 polypeptide allelic variants, as defined herein, that have been chemically modified.

The term “res02 fusion polypeptide” refers to a fusion of one or more amino acids (such as a heterologous protein or peptide) at the amino- or carboxyl-terminus of the polypeptide as set forth in SEQ ID NO:2, res02 polypeptide fragments, res02 polypeptide orthologs, res02 polypeptide variants, or res02 derivatives, as defined herein. The term “res02 fusion polypeptide” also refers to a fusion of one or more amino acids at the amino- or carboxyl-terminus of the polypeptide encoded by res02 polypeptide allelic variants or res02 polypeptide splice variants, as defined herein.

The term “inv19 fusion polypeptide” refers to a fusion of one or more amino acids (such as a heterologous protein or peptide) at the amino- or carboxyl-terminus of the polypeptide as set forth in SEQ ID NO:4, inv19 polypeptide fragments, inv19 polypeptide orthologs, inv19 polypeptide variants, or inv19 derivatives, as defined herein. The term “inv19 fusion polypeptide” also refers to a fusion of one or more amino acids at the amino- or carboxyl-terminus of the polypeptide encoded by inv19 polypeptide allelic variants or inv19 polypeptide splice variants, as defined herein.

The term “biologically active res02 polypeptides” refers to res02 polypeptides having at least one activity characteristic of the polypeptide comprising the amino acid sequence of SEQ ID NO:2. In one embodiment the activity is promoter invertase activity. In addition, a res02 polypeptide can be active as an immunogen; that is, the res02 polypeptide contains at least one epitope to which antibodies may be raised.

The term “biologically active inv19 polypeptides” refers to inv19 polypeptides having at least one activity characteristic of the polypeptide comprising the amino acid sequence of SEQ ID NO:4. In addition, an inv19 polypeptide can be active as an immunogen; that is, the inv19 polypeptide contains at least one epitope to which antibodies may be raised.

The term “isolated polypeptide” refers to a polypeptide of the presentinvention that (1) has been separated from at least about 50 percent ofpolynucleotides, lipids, carbohydrates, or other materials with which itis naturally found when isolated from the source cell, (2) is not linked(by covalent or noncovalent interaction) to all or a portion of apolypeptide to which the “isolated polypeptide” is linked in nature, (3)is operably linked (by covalent or noncovalent interaction) to apolypeptide with which it is not linked in nature, or (4) does not occurin nature. Preferably, the isolated polypeptide is substantially freefrom any other polypeptides or other contaminants that are found in itsnatural environment that would interfere with its therapeutic,diagnostic, prophylactic or research use.

The term “identity,” as known in the art, refers to a relationshipbetween the sequences of two or more polypeptide molecules or two ormore nucleic acid molecules, as determined by comparing the sequences.In the art, “identity” also means the degree of sequence relatednessbetween nucleic acid molecules or polypeptides, as the case may be, asdetermined by the match between strings of two or more nucleotide or twoor more amino acid sequences. “Identity” measures the percent ofidentical matches between the smaller of two or more sequences with gapalignments (if any) addressed by a particular mathematical model orcomputer program (i.e., “algorithms”). Percent identity with respect tonucleic acid molecules can be conveniently determined using a sequencealignment algorithm or computer program such as GAP, BLASTN, FASTA,BLASTA, BLASTX, BestFit, and the Smith-Waterman algorithm. Percentidentity with respect to polypeptides can be conveniently determinedusing a sequence alignment algorithm or computer program such as GAP,BLASTP, FASTA, BLASTA, BLASTX, BestFit, and the Smith-Watermanalgorithm.

The term “similarity” is a related concept, but in contrast to “identity,” “similarity” refers to a measure of relatedness which includes both identical matches and conservative substitution matches. If two polypeptide sequences have, for example, 10/20 identical amino acids, and the remainder are all non-conservative substitutions, then the percent identity and similarity would both be 50%. If in the same example, there are five more positions where there are conservative substitutions, then the percent identity remains 50%, but the percent similarity would be 75% (15/20). Therefore, in cases where there are conservative substitutions, the percent similarity between two polypeptides will be higher than the percent identity between those two polypeptides.
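The arithmetic in the example above can be sketched as follows. This is a minimal illustration only: the per-position labels ("identical", "conservative", "non-conservative") and the function name are hypothetical inputs chosen for the sketch, not the output of any particular alignment program.

```python
def percent_identity_and_similarity(positions):
    """Return (percent identity, percent similarity) for a list of
    per-position labels from an already aligned pair of sequences."""
    total = len(positions)
    identical = sum(1 for p in positions if p == "identical")
    conservative = sum(1 for p in positions if p == "conservative")
    identity = 100.0 * identical / total
    # Similarity counts identical matches plus conservative substitutions.
    similarity = 100.0 * (identical + conservative) / total
    return identity, similarity


# 10/20 identical, remainder all non-conservative substitutions.
case_a = ["identical"] * 10 + ["non-conservative"] * 10
# Same example, but five of the remaining positions are conservative.
case_b = ["identical"] * 10 + ["conservative"] * 5 + ["non-conservative"] * 5

print(percent_identity_and_similarity(case_a))
print(percent_identity_and_similarity(case_b))
```

The two cases correspond to the 50%/50% and 50%/75% figures given in the text.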

Except as used in connection with the term “isolated”, the term “naturally occurring” or “native” when used in connection with biological materials such as nucleic acid molecules, polypeptides, host cells, and the like, refers to materials as they are found in nature and not manipulated by man. When used in connection with the term “isolated”, the term “naturally occurring” or “native” when used in connection with biological materials such as nucleic acid molecules, polypeptides, host cells, and the like, refers to isolated materials sharing wild type structural features as they are found in nature. For example, an isolated naturally occurring res02 polypeptide is in one embodiment an isolated wild type res02 polypeptide having the sequence of SEQ ID NO:2. The term “non-naturally occurring” or “non-native” as used herein refers to a material that is not found in nature or that has been structurally modified or synthesized by man.

The term “effective amount” refers to that amount of a substance that isuseful in accomplishing the purpose for which it is used. The term“therapeutically effective amount” refers to that amount of a substancethat is useful in accomplishing the therapeutic effect for which it isused. Thus a therapeutically effective amount of a substance is thatamount of the substance that, when administered to a subject to treat orprevent a condition or disease of the subject, prevents the onset of,alleviates the symptoms of, or stops the progression of the condition ordisease of the subject.

The term “pharmaceutically acceptable carrier” or “physiologicallyacceptable carrier” as used herein refers to one or more formulationmaterials suitable for accomplishing or enhancing the delivery of anagent as a pharmaceutical composition.

The term “antigen” refers to a molecule or a portion of a moleculecapable of being bound by a selective binding agent, such as anantibody, and additionally capable of being used in an animal to produceantibodies capable of binding to an epitope of that antigen. An antigencan have one or more epitopes.

The term “selective binding agent” refers to a molecule or moleculeshaving specificity for an antigen, e.g., a res02 or inv19 polypeptide.As used herein, the terms “specific” and “specificity” refer to theability of the selective binding agents to bind to an antigen ofinterest and not to bind to other antigens.

The term “transduction” is used to refer to the transfer of genes from one bacterium to another, usually by a phage.

The term “transfection” is used to refer to the uptake of foreign orexogenous DNA by a cell, and a cell has been “transfected” when theexogenous DNA has been introduced inside the cell membrane. A number oftransfection techniques are well known in the art and are disclosedherein. See, e.g., Sambrook et al, Molecular Cloning: A LaboratoryManual (2nd ed., Cold Spring Harbor Laboratory Press, 1989). Suchtechniques can be used to introduce one or more exogenous DNA moietiesinto suitable host cells.

The term “transformation” as used herein refers to a change in a cell'sgenetic characteristics, and a cell has been transformed when it hasbeen modified to contain a new DNA. For example, a cell is transformedwhere it is genetically modified from its native state. Followingtransfection or transduction, the transforming DNA can recombine withthat of the cell by physically integrating into a chromosome of thecell, can be maintained transiently as an episomal element without beingreplicated, or can replicate independently as a plasmid. A cell isconsidered to have been stably transformed when the DNA is replicatedwith the division of the cell.

Relatedness of Nucleic Acid Molecules and/or Polypeptides

It is understood that related nucleic acid molecules include allelic variants of the nucleic acid molecule of either SEQ ID NO:1 or SEQ ID NO:3, and include sequences which are complementary to any of the above nucleotide sequences. Related nucleic acid molecules also include a nucleotide sequence encoding a polypeptide comprising or consisting essentially of a substitution, modification, addition and/or deletion of one or more amino acid residues compared to the polypeptide in either SEQ ID NO:2 or SEQ ID NO:4.

Related nucleic acid molecules also include fragments of res02 or inv19 nucleic acid molecules which encode a polypeptide of at least about 9 contiguous amino acids, or about 50 amino acids, or about 75 amino acids, or about 100 amino acids, or about 150 amino acids, or about 200 amino acids, or more than about 200 amino acid residues of the res02 or inv19 polypeptide of either SEQ ID NO:2 or SEQ ID NO:4.

In addition, related res02 or inv19 nucleic acid molecules also includethose molecules which comprise nucleotide sequences which hybridizeunder moderately or highly stringent conditions as defined herein withthe fully complementary sequence of the res02 or inv19 nucleic acidmolecule of either SEQ ID NO:1 or SEQ ID NO:3, respectively, or of amolecule encoding a polypeptide, which polypeptide comprises the aminoacid sequence as shown in either SEQ ID NO:2 or SEQ ID NO:4, or of anucleic acid fragment as defined herein, or of a nucleic acid fragmentencoding a polypeptide as defined herein. Hybridization probes can beprepared using the res02 or inv19 sequences provided herein to screencDNA, genomic or synthetic DNA libraries for related sequences. Regionsof the DNA and/or amino acid sequence of res02 or inv19 polypeptide thatexhibit significant identity to known sequences are readily determinedusing sequence alignment algorithms as described herein and thoseregions can be used to design probes for screening.

The term “nucleic acid fragment” refers to a nucleic acid molecule sharing in common with a sequence from which it derives any sequence of at least 16 contiguous bases that is at least one base shorter than the sequence from which it derives. The nucleic acid fragment can be 16, 17, 18, 19, 20, and so on, bases long, up to but not including the total number of bases of the sequence from which it derives. For example, in some embodiments a nucleic acid fragment of SEQ ID NO:1 is a 16-mer defined by bases 1-16, 2-17, 3-18, 4-19, and so on through bases 579-594 of SEQ ID NO:1. As a further example, in some embodiments a nucleic acid fragment of SEQ ID NO:1 is a 17-mer defined by bases 1-17, 2-18, 3-19, and so on through bases 578-594 of SEQ ID NO:1. In certain preferred embodiments a nucleic acid fragment encodes an immunogenic peptide of a corresponding polypeptide encoded by the full length nucleic acid sequence from which the fragment is derived. Since immunogenic peptides are generally recognized to be at least 9 amino acids long, the preferred nucleic acid fragments encoding the immunogenic peptides are at least 27 bases long. Nucleic acid fragments of the invention are also useful as primers for polymerase chain reaction amplification of target DNA and for use as probes in DNA hybridization. In some embodiments a nucleic acid fragment further contains noncontiguous sequence, flanked by contiguous sequence. For example, the fragment can add at least one base, delete at least one base, or change at least one base, and still be useful as a primer or probe. Such methods are well known in the art for the purposes of introducing a restriction endonuclease site to a PCR amplification product, introducing a mutation within a specific sequence, etc.
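The enumeration of fragments described above can be sketched as a sliding window over base positions. This is an illustrative sketch only: the function name is hypothetical, and the sequence length of 594 bases is inferred from the statement that the last 16-mer of SEQ ID NO:1 spans bases 579-594.

```python
def fragment_windows(seq_length, k):
    """Return 1-based inclusive (start, end) positions of every
    contiguous k-base fragment of a sequence of seq_length bases."""
    # A fragment is at least 16 bases long and at least one base
    # shorter than the sequence from which it derives.
    if k < 16 or k >= seq_length:
        raise ValueError("fragment length must satisfy 16 <= k < seq_length")
    return [(start, start + k - 1) for start in range(1, seq_length - k + 2)]


SEQ1_LENGTH = 594  # assumption inferred from "bases 579-594" in the text

w16 = fragment_windows(SEQ1_LENGTH, 16)
w17 = fragment_windows(SEQ1_LENGTH, 17)
print(w16[:3], w16[-1])
print(w17[:3], w17[-1])

# Minimum coding length for an immunogenic peptide of 9 amino acids,
# at three bases per codon.
min_bases_for_peptide = 3 * 9
print(min_bases_for_peptide)
```

Under these assumptions the windows run from bases 1-16 through 579-594 for 16-mers and from bases 1-17 through 578-594 for 17-mers.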

The term “highly stringent conditions” refers to those conditions that are designed to permit hybridization of DNA strands whose sequences are highly complementary, and to exclude hybridization of significantly mismatched DNAs. Hybridization stringency is principally determined by temperature, ionic strength, and the concentration of denaturing agents such as formamide. Examples of “highly stringent conditions” for hybridization and washing are 0.015 M sodium chloride, 0.0015 M sodium citrate at 65-68° C. or 0.015 M sodium chloride, 0.0015 M sodium citrate, and 50% formamide at 42° C. See Sambrook et al., Molecular Cloning: A Laboratory Manual (2nd ed., Cold Spring Harbor Laboratory Press, 1989); Anderson et al., Nucleic Acid Hybridisation: A Practical Approach Ch. 4 (IRL Press Limited).

More stringent conditions (such as higher temperature, lower ionic strength, higher formamide, or other denaturing agent) can also be used, although the rate of hybridization will be affected. Other agents can be included in the hybridization and washing buffers for the purpose of reducing non-specific and/or background hybridization. Examples are 0.1% bovine serum albumin, 0.1% polyvinyl-pyrrolidone, 0.1% sodium pyrophosphate, 0.1% sodium dodecylsulfate (SDS), ficoll, Denhardt's solution, sonicated salmon sperm DNA (or another non-complementary DNA), and dextran sulfate, although other suitable agents can also be used. The concentration and types of these additives can be changed without substantially affecting the stringency of the hybridization conditions. Hybridization experiments are usually carried out at pH 6.8-7.4; however, at typical ionic strength conditions, the rate of hybridization is nearly independent of pH. See Anderson et al., Nucleic Acid Hybridisation: A Practical Approach Ch. 4 (IRL Press Limited).

Factors affecting the stability of a DNA duplex include base composition, length, and degree of base pair mismatch. Hybridization conditions can be adjusted by one skilled in the art in order to accommodate these variables and allow DNAs of different sequence relatedness to form hybrids. The melting temperature of a perfectly matched DNA duplex can be estimated by the following equation:

T_m (° C.) = 81.5 + 16.6(log [Na⁺]) + 0.41(% G+C) − 600/N − 0.72(% formamide)

where N is the length of the duplex formed, [Na⁺] is the molar concentration of the sodium ion in the hybridization or washing solution, and % G+C is the percentage of (guanine+cytosine) bases in the hybrid. For imperfectly matched hybrids, the melting temperature is reduced by approximately 1° C. for each 1% mismatch.
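The melting temperature estimate above can be evaluated with a short calculation. The following is a minimal sketch, not a validated tool: the function and parameter names are hypothetical, the logarithm is assumed to be base 10, and the 50% G+C content and 10,000 base duplex length in the example are assumed values chosen only for illustration.

```python
import math


def duplex_tm(na_molar, gc_percent, length,
              formamide_percent=0.0, mismatch_percent=0.0):
    """Estimate the melting temperature (deg C) of a DNA duplex."""
    tm = (81.5
          + 16.6 * math.log10(na_molar)   # sodium ion term (log assumed base 10)
          + 0.41 * gc_percent             # base composition term
          - 600.0 / length                # duplex length term
          - 0.72 * formamide_percent)     # denaturant term
    # Approximately 1 deg C reduction for each 1% mismatch.
    return tm - 1.0 * mismatch_percent


# Long duplex, 0.015 M sodium ion, no formamide, assumed 50% G+C.
tm_long = duplex_tm(na_molar=0.015, gc_percent=50.0, length=10000)
print(round(tm_long, 1))

# Same duplex in 50% formamide.
tm_formamide = duplex_tm(0.015, 50.0, 10000, formamide_percent=50.0)
print(round(tm_formamide, 1))
```

With these assumed inputs the estimate for long, perfectly matched DNA at 0.015 M sodium ion falls close to 71° C.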

The term “moderately stringent conditions” refers to conditions under which a DNA duplex is able to form with a greater degree of base pair mismatching than could occur under “highly stringent conditions”. Examples of typical “moderately stringent conditions” are 0.015 M sodium chloride, 0.0015 M sodium citrate at 50-65° C. or 0.015 M sodium chloride, 0.0015 M sodium citrate, and 20% formamide at 37-50° C. By way of example, “moderately stringent conditions” of 50° C. in 0.015 M sodium ion will allow about a 21% mismatch.

It will be appreciated by those skilled in the art that there is no absolute distinction between “highly stringent conditions” and “moderately stringent conditions.” For example, at 0.015 M sodium ion (no formamide), the melting temperature of perfectly matched long DNA is about 71° C. With a wash at 65° C. (at the same ionic strength), this would allow for approximately a 6% mismatch. To capture more distantly related sequences, one skilled in the art can simply lower the temperature or raise the ionic strength.
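The mismatch figures quoted above follow from the rule of approximately 1° C. of melting temperature per 1% mismatch. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the 71° C. melting temperature is taken directly from the text rather than recomputed.

```python
def tolerated_mismatch_percent(tm_perfect_c, wash_temp_c,
                               degrees_per_percent=1.0):
    """Approximate percent mismatch tolerated at a given wash temperature,
    assuming a fixed drop in melting temperature per 1% mismatch."""
    return max(0.0, (tm_perfect_c - wash_temp_c) / degrees_per_percent)


TM_LONG_DNA_C = 71.0  # perfectly matched long DNA at 0.015 M sodium ion

print(tolerated_mismatch_percent(TM_LONG_DNA_C, 65.0))  # highly stringent wash
print(tolerated_mismatch_percent(TM_LONG_DNA_C, 50.0))  # moderately stringent wash
```

Lowering the wash temperature from 65° C. to 50° C. at the same ionic strength raises the tolerated mismatch from about 6% to about 21%, which illustrates why the two categories shade into one another.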

A good estimate of the melting temperature in 1 M NaCl, e.g., in 6× salt sodium citrate (SSC), for oligonucleotide probes up to about 20 nucleotides is given by:

T_m = 2° C. per A-T base pair + 4° C. per G-C base pair

High stringency washing conditions for oligonucleotides are usually at a temperature of 0-5° C. below the T_m of the oligonucleotide in 6×SSC, 0.1% SDS.
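The oligonucleotide estimate and the corresponding wash window can be sketched as follows. This is an illustrative sketch only: the function names and the 18-base probe sequence are hypothetical and are not taken from SEQ ID NO:1 or SEQ ID NO:3.

```python
def oligo_tm(probe):
    """Estimate melting temperature (deg C) of a short oligonucleotide as
    2 deg C per A-T base pair plus 4 deg C per G-C base pair."""
    probe = probe.upper()
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    if at + gc != len(probe):
        raise ValueError("probe must contain only A, C, G and T")
    return 2 * at + 4 * gc


def high_stringency_wash_range(probe):
    """Return (low, high) wash temperatures, 0-5 deg C below the estimate."""
    tm = oligo_tm(probe)
    return tm - 5, tm


example_probe = "ATGCATGCATGCATGCAT"  # hypothetical 18-mer: 10 A/T, 8 G/C
print(oligo_tm(example_probe))
print(high_stringency_wash_range(example_probe))
```

The estimate is intended only for probes up to about 20 nucleotides; longer duplexes are better described by the salt and composition dependent equation given earlier.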

In another embodiment, related nucleic acid molecules comprise or consist of a nucleotide sequence that is at least about 70 percent identical to the nucleotide sequence as shown in either SEQ ID NO:1 or SEQ ID NO:3, or comprise or consist essentially of a nucleotide sequence encoding a polypeptide that is at least about 70 percent identical to the polypeptide as set forth in either SEQ ID NO:2 or SEQ ID NO:4. In certain embodiments, the nucleotide sequences are about 75 percent, or about 80 percent, or about 85 percent, or about 90 percent, or about 95, 96, 97, 98, or 99 percent identical to the nucleotide sequence as shown in either SEQ ID NO:1 or SEQ ID NO:3, or the nucleotide sequences encode a polypeptide that is about 75 percent, or about 80 percent, or about 85 percent, or about 90 percent, or about 95, 96, 97, 98, or 99 percent identical to the polypeptide sequence as set forth in either SEQ ID NO:2 or SEQ ID NO:4. Related nucleic acid molecules encode polypeptides possessing at least one activity of the polypeptide set forth in either SEQ ID NO:2 or SEQ ID NO:4.

Differences in the nucleic acid sequence can result in conservativeand/or non-conservative modifications of the amino acid sequencerelative to the amino acid sequence of either SEQ ID NO:2 or SEQ IDNO:4. Conservative modifications to the amino acid sequence of eitherSEQ ID NO:2 or SEQ ID NO:4 (and the corresponding modifications to theencoding nucleotides) will produce a polypeptide having functional andchemical characteristics similar to those of res02 or inv19polypeptides. In contrast, substantial modifications in the functionaland/or chemical characteristics of res02 or inv19 polypeptides can beaccomplished by selecting substitutions in the amino acid sequence ofeither SEQ ID NO:2 or SEQ ID NO:4 that differ significantly in theireffect on maintaining (a) the structure of the molecular backbone in thearea of the substitution, for example, as a sheet or helicalconformation, (b) the charge or hydrophobicity of the molecule at thetarget site, or (c) the bulk of the side chain.

For example, a “conservative amino acid substitution” can involve a substitution of a native amino acid residue with a non-native residue such that there is little or no effect on the polarity or charge of the amino acid residue at that position. Furthermore, any native residue in the polypeptide can also be substituted with alanine, as has been previously described for “alanine scanning mutagenesis”. Conservative amino acid substitutions also encompass non-naturally occurring amino acid residues that are typically incorporated by chemical peptide synthesis rather than by synthesis in biological systems. These include peptidomimetics, and other reversed or inverted forms of amino acid moieties.

Naturally occurring residues can be divided into the following classes based on common side chain properties: (1) hydrophobic: norleucine, Met, Ala, Val, Leu, Ile; (2) neutral hydrophilic: Cys, Ser, Thr; (3) acidic: Asp, Glu; (4) basic: Asn, Gln, His, Lys, Arg; (5) residues that influence chain orientation: Gly, Pro; and (6) aromatic: Trp, Tyr, Phe.

Thus, for example, conservative amino acid substitutions can involve the exchange of a member from one of these classes for another member from the same class. Non-conservative amino acid substitutions can involve the exchange of a member of one of these classes for a member from another class.
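The class-based rule above can be expressed as a simple lookup. The sketch below encodes the six classes exactly as listed in the text; the dictionary and function names, and the three-letter code "Nle" for norleucine, are illustrative choices made for the sketch.

```python
SIDE_CHAIN_CLASSES = {
    "hydrophobic": {"Nle", "Met", "Ala", "Val", "Leu", "Ile"},
    "neutral hydrophilic": {"Cys", "Ser", "Thr"},
    "acidic": {"Asp", "Glu"},
    "basic": {"Asn", "Gln", "His", "Lys", "Arg"},
    "chain orientation": {"Gly", "Pro"},
    "aromatic": {"Trp", "Tyr", "Phe"},
}


def residue_class(residue):
    """Return the name of the side chain class containing the residue."""
    for name, members in SIDE_CHAIN_CLASSES.items():
        if residue in members:
            return name
    raise KeyError("unknown residue: " + residue)


def is_conservative(original, replacement):
    """A substitution is treated as conservative when both residues
    belong to the same side chain class."""
    return residue_class(original) == residue_class(replacement)


print(is_conservative("Asp", "Glu"))  # both acidic
print(is_conservative("Leu", "Lys"))  # hydrophobic versus basic
```

This classification is only one of the criteria discussed here; the hydropathy and hydrophilicity scales described in the following paragraphs provide finer-grained measures.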

In making such changes, the hydropathy index of amino acids can be considered. Each amino acid has been assigned a hydropathy index on the basis of its hydrophobicity and charge characteristics. The hydropathy indices are: isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine/cystine (+2.5); methionine (+1.9); alanine (+1.8); glycine (−0.4); threonine (−0.7); serine (−0.8); tryptophan (−0.9); tyrosine (−1.3); proline (−1.6); histidine (−3.2); glutamate (−3.5); glutamine (−3.5); aspartate (−3.5); asparagine (−3.5); lysine (−3.9); and arginine (−4.5).

The importance of the amino acid hydropathy index in conferring interactive biological function on a protein is generally understood in the art. Kyte J et al. (1982) J Mol Biol 157:105-32. It is known that certain amino acids can be substituted for other amino acids having a similar hydropathy index or score and still retain a similar biological activity. In making changes based upon the hydropathy index, the substitution of amino acids whose hydropathy indices are within ±2 is preferred, those which are within ±1 are particularly preferred, and those within ±0.5 are even more particularly preferred.
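The preference tiers above can be checked against the listed hydropathy indices with a short lookup. This is a minimal sketch: the dictionary reproduces the values given in the text, while the function name and the returned tier labels are hypothetical conveniences.

```python
HYDROPATHY = {
    "Ile": 4.5, "Val": 4.2, "Leu": 3.8, "Phe": 2.8, "Cys": 2.5,
    "Met": 1.9, "Ala": 1.8, "Gly": -0.4, "Thr": -0.7, "Ser": -0.8,
    "Trp": -0.9, "Tyr": -1.3, "Pro": -1.6, "His": -3.2, "Glu": -3.5,
    "Gln": -3.5, "Asp": -3.5, "Asn": -3.5, "Lys": -3.9, "Arg": -4.5,
}


def hydropathy_tier(original, replacement):
    """Classify a substitution by the difference in hydropathy index.

    Returns None when the indices differ by more than 2."""
    diff = abs(HYDROPATHY[original] - HYDROPATHY[replacement])
    if diff <= 0.5:
        return "even more particularly preferred"
    if diff <= 1.0:
        return "particularly preferred"
    if diff <= 2.0:
        return "preferred"
    return None


print(hydropathy_tier("Ile", "Val"))  # indices differ by 0.3
print(hydropathy_tier("Lys", "Arg"))  # indices differ by 0.6
print(hydropathy_tier("Leu", "Met"))  # indices differ by 1.9
print(hydropathy_tier("Ile", "Arg"))  # indices differ by 9.0
```

A result of None indicates only that the substitution falls outside the ranges described here, not that it is necessarily non-functional.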

It is also understood in the art that the substitution of like aminoacids can be made effectively on the basis of hydrophilicity,particularly where the biologically functionally equivalent protein orpeptide thereby created is intended for use in immunologicalembodiments, as in the present case. The greatest local averagehydrophilicity of a protein, as governed by the hydrophilicity of itsadjacent amino acids, correlates with its immunogenicity andantigenicity, i.e., with a biological property of the protein.

The following hydrophilicity values have been assigned to these amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0±1); glutamate (+3.0±1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (−0.4); proline (−0.5±1); alanine (−0.5); histidine (−0.5); cysteine (−1.0); methionine (−1.3); valine (−1.5); leucine (−1.8); isoleucine (−1.8); tyrosine (−2.3); phenylalanine (−2.5); and tryptophan (−3.4). In making changes based upon similar hydrophilicity values, the substitution of amino acids whose hydrophilicity values are within ±2 is preferred, those which are within ±1 are particularly preferred, and those within ±0.5 are even more particularly preferred. One can also identify epitopes from primary amino acid sequences on the basis of hydrophilicity. These regions are also referred to as “epitopic core regions”.
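One way to locate a region of greatest local average hydrophilicity is a sliding-window average over the values listed above. The sketch below is illustrative only: the window length of six residues, the use of the central value for entries given with a ±1 range, the function name, and the example peptide are all assumptions made for the sketch and are not specified in the text.

```python
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def most_hydrophilic_window(sequence, window=6):
    """Return (0-based start, average value) of the window with the
    greatest average hydrophilicity; the first such window wins ties."""
    sequence = sequence.upper()
    if len(sequence) < window:
        raise ValueError("sequence is shorter than the window")
    best_start, best_avg = 0, None
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        avg = sum(HYDROPHILICITY[res] for res in segment) / window
        if best_avg is None or avg > best_avg:
            best_start, best_avg = start, avg
    return best_start, best_avg


# Hypothetical peptide with a charged stretch flanked by leucines.
start, avg = most_hydrophilic_window("LLLLLLRKDERKLLLLLL")
print(start, avg)
```

A window identified this way is only a candidate epitopic core region; whether it is actually immunogenic would have to be established experimentally.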

Desired amino acid substitutions (whether conservative or non-conservative) can be determined by those skilled in the art at the time such substitutions are desired. For example, amino acid substitutions can be used to identify important residues of the res02 or inv19 polypeptide, or to increase or decrease the affinity of the res02 or inv19 polypeptides described herein. Exemplary amino acid substitutions are set forth in Table 2.

TABLE 2
Amino Acid Substitutions

Original Residues  Exemplary Substitutions                   Preferred Substitutions
Ala                Val, Leu, Ile                             Val
Arg                Lys, Gln, Asn                             Lys
Asn                Gln                                       Gln
Asp                Glu                                       Glu
Cys                Ser, Ala                                  Ser
Gln                Asn                                       Asn
Glu                Asp                                       Asp
Gly                Pro, Ala                                  Ala
His                Asn, Gln, Lys, Arg                        Arg
Ile                Leu, Val, Met, Ala, Phe, Norleucine       Leu
Leu                Ile, Val, Met, Ala, Phe, Norleucine       Ile
Lys                Arg, 1,4 Diamino-butyric Acid, Gln, Asn   Arg
Met                Leu, Phe, Ile                             Leu
Phe                Leu, Val, Ile, Ala, Tyr                   Leu
Pro                Ala, Gly                                  Gly
Ser                Thr, Ala, Cys                             Thr
Thr                Ser                                       Ser
Trp                Tyr, Phe                                  Tyr
Tyr                Trp, Phe, Thr, Ser                        Phe
Val                Ile, Met, Leu, Phe, Ala, Norleucine       Leu

A skilled artisan will be able to determine suitable variants of thepolypeptide as set forth in either SEQ ID NO:2 or SEQ ID NO:4 usingwell-known techniques. For identifying suitable areas of the moleculethat can be changed without destroying biological activity, one skilledin the art can target areas not believed to be important for activity.For example, when similar polypeptides with similar activities from thesame species or from other species are known, one skilled in the art cancompare the amino acid sequence of a res02 or inv19 polypeptide to suchsimilar polypeptides. With such a comparison, one can identify residuesand portions of the molecules that are conserved among similarpolypeptides. It will be appreciated that changes in areas of the res02or inv19 molecule that are not conserved relative to such similarpolypeptides would be less likely to adversely affect the biologicalactivity and/or structure of a res02 or inv19 polypeptide. One skilledin the art would also know that, even in relatively conserved regions,one can substitute chemically similar amino acids for the naturallyoccurring residues while retaining activity (conservative amino acidresidue substitutions). Therefore, even areas that may be important forbiological activity or for structure can be subject to conservativeamino acid substitutions without destroying the biological activity orwithout adversely affecting the polypeptide structure.

Additionally, one skilled in the art can review structure-functionstudies identifying residues in similar polypeptides that are importantfor activity or structure. In view of such a comparison, one can predictthe importance of amino acid residues in a res02 or inv19 polypeptidethat correspond to amino acid residues that are important for activityor structure in similar polypeptides. One skilled in the art can opt forchemically similar amino acid substitutions for such predicted importantamino acid residues of res02 or inv19 polypeptides.

In other embodiments, related nucleic acid molecules comprise or consistof a nucleotide sequence encoding a polypeptide as set forth in eitherSEQ ID NO:2 or SEQ ID NO:4 with at least one amino acid insertion andwherein the polypeptide has an activity of the polypeptide set forth ineither SEQ ID NO:2 or SEQ ID NO:4, or a nucleotide sequence encoding apolypeptide as set forth in either SEQ ID NO:2 or SEQ ID NO:4 with atleast one amino acid deletion and wherein the polypeptide has anactivity of the polypeptide set forth in either SEQ ID NO:2 or SEQ IDNO:4. Related nucleic acid molecules also comprise or consist of anucleotide sequence encoding a polypeptide as set forth in either SEQ IDNO:2 or SEQ ID NO:4 wherein the polypeptide has a carboxyl- and/oramino-terminal truncation and further wherein the polypeptide has anactivity of the polypeptide set forth in either SEQ ID NO:2 or SEQ IDNO:4. Related nucleic acid molecules also comprise or consist of anucleotide sequence encoding a polypeptide as set forth in either SEQ IDNO:2 or SEQ ID NO:4 with at least one modification selected from thegroup consisting of amino acid substitutions, amino acid insertions,amino acid deletions, carboxyl-terminal truncations, and amino-terminaltruncations and wherein the polypeptide has an activity of thepolypeptide set forth in either SEQ ID NO:2 or SEQ ID NO:4.

In addition, the polypeptide comprising the amino acid sequence ofeither SEQ ID NO:2 or SEQ ID NO:4, or other res02 or inv19 polypeptide,can be fused to a homologous polypeptide to form a homodimer or to aheterologous polypeptide to form a heterodimer. Heterologous peptidesand polypeptides include, but are not limited to: an epitope to allowfor the detection and/or isolation of a res02 or inv19 fusionpolypeptide; a transmembrane receptor protein or a portion thereof, suchas an extracellular domain or a transmembrane and intracellular domain;a ligand or a portion thereof which binds to a transmembrane receptorprotein; an enzyme or portion thereof which is catalytically active; apolypeptide or peptide which promotes oligomerization, such as a leucinezipper domain; a polypeptide or peptide which increases stability, suchas an immunoglobulin constant region; and a polypeptide which has atherapeutic activity different from the polypeptide comprising the aminoacid sequence as set forth in either SEQ ID NO:2 or SEQ ID NO:4, orother res02 or inv19 polypeptide.

Fusions can be made either at the amino-terminus or at thecarboxyl-terminus of the polypeptide comprising the amino acid sequenceset forth in either SEQ ID NO:2 or SEQ ID NO:4, or other res02 or inv19polypeptide. Fusions can be direct with no linker or adapter molecule,or they can be through a linker or adapter molecule. A linker or adaptermolecule can be one or more amino acid residues, typically from about 20to about 50 amino acid residues. A linker or adapter molecule can alsobe designed with a cleavage site for a DNA restriction endonuclease orfor a protease to allow for the separation of the fused moieties. Itwill be appreciated that once constructed, the fusion polypeptides canbe derivatized according to the methods described herein.

In a further embodiment of the invention, the polypeptide comprising theamino acid sequence of either SEQ ID NO:2 or SEQ ID NO:4, or other res02or inv19 polypeptide, is fused to one or more domains of an Fc region ofhuman IgG. Antibodies comprise two functionally distinct parts, avariable domain known as “Fab” that binds an antigen, and a constantdomain known as “Fc” that is involved in effector functions such ascomplement activation and attack by phagocytic cells. An Fc has a longserum half-life, whereas an Fab is short-lived. Capon D J et al. (1989)Nature 337:525-31. When constructed together with a therapeutic protein,an Fc domain can provide longer half-life or incorporate such functionsas Fc receptor binding, protein A binding, complement fixation, andperhaps even placental transfer.

In one example, a human IgG hinge, CH2, and CH3 region can be fused ateither the amino-terminus or carboxyl-terminus of the res02 polypeptidesusing methods known to the skilled artisan. In another example, a humanIgG hinge, CH2, and CH3 region can be fused at either the amino-terminusor carboxyl-terminus of a res02 polypeptide fragment.

The resulting fusion polypeptide may be purified by use of a Protein Aaffinity column. Peptides and proteins fused to an Fc region have beenfound to exhibit a substantially greater half-life in vivo than theunfused counterpart. Also, a fusion to an Fc region allows fordimerization/multimerization of the fusion polypeptide. The Fc regioncan be a naturally occurring Fc region, or it can be altered to improvecertain qualities, such as therapeutic qualities, circulation time, orreduced aggregation.

Identity and similarity of related nucleic acid molecules and polypeptides are readily calculated by known methods. Such methods include, but are not limited to, those described in Computational Molecular Biology (A. M. Lesk, ed., Oxford University Press 1988); Biocomputing: Informatics and Genome Projects (D. W. Smith, ed., Academic Press 1993); Computer Analysis of Sequence Data (Part 1, A. M. Griffin and H. G. Griffin, eds., Humana Press 1994); G. von Heijne, Sequence Analysis in Molecular Biology (Academic Press 1987); Sequence Analysis Primer (M. Gribskov and J. Devereux, eds., M. Stockton Press 1991); and Carillo et al. (1988) SIAM J Applied Math 48:1073.

Preferred methods to determine identity and/or similarity are designedto give the largest match between the sequences tested. Methods todetermine identity and similarity are described in publicly availablecomputer programs. Preferred computer program methods to determineidentity and similarity between two sequences include, but are notlimited to, the GCG program package, including GAP (Devereux J et al.(1984) Nucleic Acids Res 12(1 Pt 1):387-95; Genetics Computer Group,University of Wisconsin, Madison, Wis.), BLASTP, BLASTN, and FASTA(Altschul S F et al. (1990) J Mol Biol 215:403-10). The BLASTX programis publicly available from the National Center for BiotechnologyInformation (NCBI) and other sources (Altschul et al., BLAST Manual (NCBNLM NIH, Bethesda, Md.); Altschul S F et al. (1990) J Mol Biol215:403-10). The well-known Smith Waterman algorithm can also be used todetermine identity.

Certain alignment schemes for aligning two amino acid sequences canresult in the matching of only a short region of the two sequences, andthis small aligned region can have very high sequence identity eventhough there is no significant relationship between the two full-lengthsequences. Accordingly, in one embodiment, the selected alignment method(e.g., GAP program) will result in an alignment that spans at least 50contiguous amino acids of the claimed polypeptide.

For example, using the computer algorithm GAP (Genetics Computer Group,University of Wisconsin, Madison, Wis.), two polypeptides for which thepercent sequence identity is to be determined are aligned for optimalmatching of their respective amino acids (the “matched span,” asdetermined by the algorithm). A gap opening penalty (which is calculatedas 3× the average diagonal; the “average diagonal” is the average of thediagonal of the comparison matrix being used; the “diagonal” is thescore or number assigned to each perfect amino acid match by theparticular comparison matrix) and a gap extension penalty (which isusually 0.1× the gap opening penalty), as well as a comparison matrixsuch as PAM 250 or BLOSUM 62 are used in conjunction with the algorithm.A standard comparison matrix is also used by the algorithm. See: Dayhoffet al., 5 Atlas of Protein Sequence and Structure (Supp. 3 1978) (PAM250comparison matrix); Henikoff S et al. (1992) Proc Natl Acad Sci USA89:10915-19 (BLOSUM 62 comparison matrix).

Preferred parameters for polypeptide sequence comparison include the following: Algorithm: Needleman S B et al. (1970) J Mol Biol 48:443-53; Comparison matrix: BLOSUM 62 (Henikoff et al., supra); Gap Penalty: 12; Gap Length Penalty: 4; Threshold of Similarity: 0.

The GAP program is useful with the above parameters. The aforementioned parameters are the default parameters for polypeptide comparisons (along with no penalty for end gaps) using the GAP algorithm.

Preferred parameters for nucleic acid molecule sequence comparison include the following: Algorithm: Needleman et al., supra; Comparison matrix: matches=+10, mismatch=0; Gap Penalty: 50; Gap Length Penalty: 3.

The GAP program is also useful with the above parameters. The aforementioned parameters are the default parameters for nucleic acid molecule comparisons.
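To make the role of these parameters concrete, the sketch below scores a pair of nucleotide sequences that have already been aligned (with "-" marking gap positions). It is not an implementation of the GAP program or of the Needleman et al. algorithm, which search for the best-scoring alignment; it only evaluates one given alignment. The function name is hypothetical, and the convention that a gap of length L costs the gap penalty plus L times the gap length penalty, with end gaps penalized like any other gap, is an assumption made for illustration.

```python
def score_nucleotide_alignment(aligned_a, aligned_b, match=10, mismatch=0,
                               gap_penalty=50, gap_length_penalty=3):
    """Score two equal-length aligned strings in which '-' marks a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    previous_gap = None  # which sequence, if any, had a gap in the last column
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            raise ValueError("a column cannot be a gap in both sequences")
        if x == "-":
            current_gap = "a"
        elif y == "-":
            current_gap = "b"
        else:
            current_gap = None
        if current_gap is not None:
            if current_gap != previous_gap:
                score -= gap_penalty         # charged once when a gap opens
            score -= gap_length_penalty      # charged for every gap position
        else:
            score += match if x == y else mismatch
        previous_gap = current_gap
    return score


print(score_nucleotide_alignment("ACGTACGT", "ACGTACGT"))  # 8 matches
print(score_nucleotide_alignment("ACGTACGT", "ACG--CGT"))  # 6 matches, one 2-base gap
print(score_nucleotide_alignment("ACGT", "ACGA"))          # 3 matches, 1 mismatch
```

Because the gap penalty is five times the match score, a single gap is only worthwhile in an optimal alignment when it brings a substantial number of additional matches into register.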

Other exemplary algorithms, gap opening penalties, gap extensionpenalties, comparison matrices, and thresholds of similarity can beused, including those set forth in the Program Manual, WisconsinPackage, Version 9, September, 1997. The particular choices to be madewill be apparent to those of skill in the art and will depend on thespecific comparison to be made, such as DNA-to-DNA, protein-to-protein,protein-to-DNA; and additionally, whether the comparison is betweengiven pairs of sequences (in which case GAP or BestFit are generallypreferred) or between one sequence and a large database of sequences (inwhich case FASTA or BLASTA are preferred).

Nucleic Acid Molecules

The nucleic acid molecules encoding a polypeptide comprising the amino acid sequence of a res02 or inv19 polypeptide can readily be obtained in a variety of ways including, without limitation, chemical synthesis, genomic library screening, expression library screening, and/or PCR amplification of genomic DNA.

Recombinant DNA methods used herein are generally those set forth in Sambrook et al., Molecular Cloning: A Laboratory Manual (2nd ed., Cold Spring Harbor Laboratory Press, 1989) and/or Current Protocols in Molecular Biology (Ausubel et al., eds., Green Publishers Inc. and Wiley and Sons 1994). The invention provides for nucleic acid molecules as described herein and methods for obtaining such molecules.

Where a gene encoding the amino acid sequence of a res02 or inv19 polypeptide has been identified from one species, all or a portion of that gene can be used as a probe to identify orthologs or related genes from the same species. The probes or primers can be used to screen genomic DNA from various bacteria believed to express the res02 or inv19 polypeptide. In addition, part or all of a nucleic acid molecule having the sequence as set forth in either SEQ ID NO:1 or SEQ ID NO:3 can be used to screen a genomic library to identify and isolate a gene encoding the amino acid sequence of a res02 or inv19 polypeptide. Typically, conditions of moderate or high stringency will be employed for screening to minimize the number of false positives obtained from the screening.

Nucleic acid molecules encoding the amino acid sequence of res02 or inv19 polypeptides can also be identified by expression cloning, which employs the detection of positive clones based upon a property of the expressed protein. Typically, nucleic acid libraries are screened by the binding of an antibody or other binding partner (e.g., receptor or ligand) to cloned proteins that are expressed and displayed on a host cell surface. The antibody or binding partner is modified with a detectable label to identify those cells expressing the desired clone.

Recombinant expression techniques conducted in accordance with the descriptions set forth below can be followed to produce these polynucleotides and to express the encoded polypeptides. For example, by inserting a nucleic acid sequence that encodes the amino acid sequence of a res02 or inv19 polypeptide into an appropriate vector, one skilled in the art can readily produce large quantities of the desired nucleotide sequence. The sequences can then be used to generate detection probes or amplification primers. Alternatively, a polynucleotide encoding the amino acid sequence of a res02 or inv19 polypeptide can be inserted into an expression vector. By introducing the expression vector into an appropriate host, the encoded res02 or inv19 polypeptide can be produced in large amounts.

Another method for obtaining a suitable nucleic acid sequence is the polymerase chain reaction (PCR). In this method, DNA template is prepared from an appropriate source. Two primers, typically complementary to two separate regions of DNA encoding the amino acid sequence of a res02 or inv19 polypeptide, are then added to the template DNA along with a polymerase such as Taq polymerase, and the polymerase amplifies the DNA region between the two primers.
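By way of illustration only, the relationship between two primers and the amplified region can be sketched in Python as a simple string search. The sketch assumes exact primer matches on a single template strand written 5′ to 3′, with the second primer given as it would be ordered (i.e., complementary to that strand). The function names and the short template are hypothetical and do not correspond to any sequence of the invention.

```python
# Map each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5' to 3')."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplify(template, forward_primer, reverse_primer):
    """Return the region bounded by the two primers, or None.

    The forward primer must occur in the template as written; the
    reverse primer must match the opposite strand, so its reverse
    complement is searched for downstream of the forward primer.
    """
    start = template.find(forward_primer)
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end < 0:
        return None
    return template[start:end + len(reverse_site)]


if __name__ == "__main__":
    template = "GGATCCATGAAACCCGGGTTTTAAGAATTC"  # hypothetical template
    print(amplify(template, "ATGAAA", "TTAAAA"))
```

In this toy case the product spans from the first nucleotide of the forward primer site through the last nucleotide of the reverse primer site; an actual amplification would additionally depend on annealing temperature, mismatches, and primer length.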

Another means of preparing a nucleic acid molecule encoding the amino acid sequence of a res02 or inv19 polypeptide is chemical synthesis using methods well known to the skilled artisan such as those described by Engels et al. (1989) Angew Chem Intl Ed 28:716-34. These methods include, inter alia, the phosphotriester, phosphoramidite, and H-phosphonate methods for nucleic acid synthesis. A preferred method for such chemical synthesis is polymer-supported synthesis using standard phosphoramidite chemistry. Typically, the DNA encoding the amino acid sequence of a res02 or inv19 polypeptide will be several hundred nucleotides in length. Nucleic acids larger than about 100 nucleotides can be synthesized as several fragments using these methods. The fragments can then be ligated together to form the full-length nucleotide sequence of a res02 or inv19 gene. Other methods known to the skilled artisan can be used as well.

In certain embodiments, nucleic acid variants contain codons which have been altered for optimal expression of a res02 or inv19 polypeptide in a given host cell. Particular codon alterations will depend upon the res02 or inv19 polypeptide and host cell selected for expression. Such “codon optimization” can be carried out by a variety of methods, for example, by selecting codons which are preferred for use in highly expressed genes in a given host cell. Computer algorithms which incorporate codon frequency tables such as “Eco_high.Cod” for codon preference of highly expressed bacterial genes can be used and are provided by the University of Wisconsin Package Version 9.0 (Genetics Computer Group, Madison, Wis.). Other useful codon frequency tables include “Celegans_high.cod,” “Celegans_low.cod,” “Drosophila_high.cod,” “Human_high.cod,” “Maize_high.cod,” and “Yeast_high.cod.”
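By way of illustration only, the selection of preferred codons from a codon frequency table can be sketched in Python as follows. The table shown is a hypothetical placeholder covering three amino acids; its values are invented for the example and are not taken from “Eco_high.Cod” or any other table named above. The function name back_translate is likewise hypothetical.

```python
# Hypothetical codon frequency table: amino acid -> {codon: relative frequency}.
# The numbers are placeholders for illustration, not measured values.
EXAMPLE_TABLE = {
    "M": {"ATG": 1.0},
    "K": {"AAA": 0.8, "AAG": 0.2},
    "L": {"CTG": 0.8, "CTC": 0.1, "TTA": 0.1},
}


def back_translate(protein, table):
    """Encode a protein using the most frequent codon for each residue."""
    codons = []
    for residue in protein:
        if residue not in table:
            raise ValueError(f"no codon data for residue {residue!r}")
        frequencies = table[residue]
        # Pick the codon with the highest frequency for this residue.
        codons.append(max(frequencies, key=frequencies.get))
    return "".join(codons)


if __name__ == "__main__":
    print(back_translate("MKL", EXAMPLE_TABLE))
```

Always taking the single most frequent codon is only one possible strategy; other approaches sample codons in proportion to their frequency or avoid particular sequence motifs.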

In some cases, it may be desirable to prepare nucleic acid molecules encoding res02 or inv19 polypeptide variants. Nucleic acid molecules encoding variants can be produced using site directed mutagenesis, PCR amplification, or other appropriate methods, where the primer(s) have the desired point mutations (see Sambrook et al., supra, and Ausubel et al., supra, for descriptions of mutagenesis techniques). Chemical synthesis using methods described by Engels et al., supra, can also be used to prepare such variants. Other methods known to the skilled artisan may be used as well.

Vectors and Host Cells

A nucleic acid molecule encoding the amino acid sequence of a res02 or inv19 polypeptide is inserted into an appropriate expression vector using standard ligation techniques. The vector is typically selected to be functional in the particular host cell employed (i.e., the vector is compatible with the host cell machinery such that amplification of the gene and/or expression of the gene can occur). A nucleic acid molecule encoding the amino acid sequence of a res02 or inv19 polypeptide can be amplified/expressed in prokaryotic, yeast, insect (baculovirus systems) and/or eukaryotic host cells. Selection of the host cell will depend in part on whether a res02 or inv19 polypeptide is to be post-translationally modified (e.g., glycosylated and/or phosphorylated). If so, yeast, insect, or mammalian host cells are preferable. For a review of expression vectors, see Meth. Enz., vol. 185 (D. V. Goeddel, ed., Academic Press 1990).

Typically, expression vectors used in any of the host cells will contain sequences for plasmid maintenance and for cloning and expression of exogenous nucleotide sequences. Such sequences, collectively referred to as “flanking sequences,” in certain embodiments will typically include one or more of the following nucleotide sequences: a promoter, one or more enhancer sequences, an origin of replication, a transcriptional termination sequence, a complete intron sequence containing a donor and acceptor splice site, a sequence encoding a leader sequence for polypeptide secretion, a ribosome binding site, a polyadenylation sequence, a polylinker region for inserting the nucleic acid encoding the polypeptide to be expressed, and a selectable marker element. For expression of res02 or inv19, expression vectors used in any of the host cells typically will include one or more of the following nucleotide sequences: a promoter, one or more enhancer sequences, an origin of replication, a transcriptional termination sequence, a sequence encoding a leader sequence for polypeptide secretion, a ribosome binding site, a polylinker region for inserting the nucleic acid encoding the polypeptide to be expressed, and a selectable marker element.

Optionally, the vector can contain a “tag”-encoding sequence, i.e., an oligonucleotide molecule located at the 5′ or 3′ end of the res02 or inv19 polypeptide coding sequence; the oligonucleotide sequence encodes polyHis (such as hexaHis), or another “tag” such as FLAG, HA (hemagglutinin influenza virus), or myc for which commercially available antibodies exist. This tag is typically fused to the polypeptide upon expression of the polypeptide, and can serve as a means for affinity purification of the res02 or inv19 polypeptide from the host cell. Affinity purification can be accomplished, for example, by column chromatography using antibodies against the tag as an affinity matrix. Optionally, the tag can subsequently be removed from the purified res02 or inv19 polypeptide by various means such as using certain peptidases for cleavage.

Flanking sequences can be homologous (i.e., from the same species and/or strain as the host cell), heterologous (i.e., from a species other than the host cell species or strain), hybrid (i.e., a combination of flanking sequences from more than one source), or synthetic, or the flanking sequences can be native sequences which normally function to regulate res02 or inv19 polypeptide expression. As such, the source of a flanking sequence can be any prokaryotic or eukaryotic organism, any vertebrate or invertebrate organism, or any plant, provided that the flanking sequence is functional in, and can be activated by, the host cell machinery.

Flanking sequences useful in the vectors of this invention can be obtained by any of several methods well known in the art. Typically, flanking sequences useful herein, other than the res02 or inv19 gene flanking sequences, will have been previously identified by mapping and/or by restriction endonuclease digestion and can thus be isolated from the proper source using the appropriate restriction endonucleases. In some cases, the full nucleotide sequence of a flanking sequence may be known. Here, the flanking sequence can be synthesized using the methods described herein for nucleic acid synthesis or cloning.

Where all or only a portion of the flanking sequence is known, it can be obtained using PCR and/or by screening a genomic library with a suitable oligonucleotide and/or flanking sequence fragment from the same or another species. Where the flanking sequence is not known, a fragment of DNA containing a flanking sequence can be isolated from a larger piece of DNA that may contain, for example, a coding sequence or even another gene or genes. Isolation can be accomplished by restriction endonuclease digestion to produce the proper DNA fragment followed by isolation using agarose gel purification, Qiagen® column chromatography (Chatsworth, Calif.), or other methods known to the skilled artisan. The selection of suitable enzymes to accomplish this purpose will be readily apparent to one of ordinary skill in the art.

An origin of replication is typically a part of those prokaryotic expression vectors purchased commercially, and the origin aids in the amplification of the vector in a host cell. Amplification of the vector to a certain copy number can, in some cases, be important for the optimal expression of a res02 or inv19 polypeptide. If the vector of choice does not contain an origin of replication site, one may be chemically synthesized based on a known sequence, and ligated into the vector. For example, the origin of replication from the plasmid pBR322 (New England Biolabs, Beverly, Mass.) is suitable for most gram-negative bacteria and various origins (e.g., SV40, polyoma, adenovirus, vesicular stomatitis virus (VSV), or papillomaviruses such as HPV or BPV) are useful for cloning vectors in mammalian cells. Generally, the origin of replication component is not needed for mammalian expression vectors (for example, the SV40 origin is often used only because it contains the early promoter).

A transcription termination sequence is typically located 3′ of the end of a polypeptide coding region and serves to terminate transcription. Usually, a transcription termination sequence in prokaryotic cells is a G-C rich fragment followed by a poly-T sequence. While the sequence is easily cloned from a library or even purchased commercially as part of a vector, it can also be readily synthesized using methods for nucleic acid synthesis such as those described herein.

A selectable marker gene element encodes a protein necessary for the survival and growth of a host cell grown in a selective culture medium. Typical selection marker genes encode proteins that (a) confer resistance to antibiotics or other toxins, e.g., ampicillin, tetracycline, or kanamycin for prokaryotic host cells; (b) complement auxotrophic deficiencies of the cell; or (c) supply critical nutrients not available from complex media. A neomycin resistance gene can also be used for selection in prokaryotic and eukaryotic host cells.

Other selection genes can be used to amplify the gene that will be expressed. Amplification is the process wherein genes that are in greater demand for the production of a protein critical for growth are reiterated in tandem within the chromosomes of successive generations of recombinant cells. Examples of suitable selectable markers for mammalian cells include dihydrofolate reductase (DHFR) and thymidine kinase. The mammalian cell transformants are placed under selection pressure wherein only the transformants are uniquely adapted to survive by virtue of the selection gene present in the vector. Selection pressure is imposed by culturing the transformed cells under conditions in which the concentration of selection agent in the medium is successively changed, thereby leading to the amplification of both the selection gene and the DNA that encodes a res02 or inv19 polypeptide. As a result, increased quantities of res02 or inv19 polypeptide are synthesized from the amplified DNA.

A ribosome binding site is usually necessary for translation initiation of mRNA and is characterized by a Shine-Dalgarno sequence (prokaryotes) or a Kozak sequence (eukaryotes). The element is typically located 3′ to the promoter and 5′ to the coding sequence of a res02 or inv19 polypeptide to be expressed. The Shine-Dalgarno sequence is varied but is typically a polypurine (i.e., having a high A-G content). Many Shine-Dalgarno sequences have been identified, each of which can be readily synthesized using methods set forth herein and used in a prokaryotic vector.
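By way of illustration only, the polypurine character of a candidate Shine-Dalgarno sequence can be sketched in Python by scanning the region upstream of a start codon for the window with the highest A-G content. The window width of six nucleotides, the function names, and the short upstream sequence are assumptions made for the example; the sketch is a crude heuristic and is not a validated ribosome binding site predictor.

```python
def purine_fraction(seq):
    """Return the fraction of positions in seq that are A or G."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "AG" for base in seq) / len(seq)


def richest_purine_window(upstream, width=6):
    """Return (offset, window, fraction) for the most purine-rich window.

    `upstream` is the DNA immediately 5' of a start codon, written
    5' to 3'. Ties are resolved in favor of the leftmost window.
    """
    if len(upstream) < width:
        raise ValueError("upstream region is shorter than the window")
    candidates = [
        (i, upstream[i:i + width], purine_fraction(upstream[i:i + width]))
        for i in range(len(upstream) - width + 1)
    ]
    # max() returns the first maximal element, i.e. the leftmost window.
    return max(candidates, key=lambda item: item[2])


if __name__ == "__main__":
    upstream = "TTTCTTAGGAGGTCATCC"  # hypothetical upstream region
    print(richest_purine_window(upstream))
```

In the toy sequence the window AGGAGG, consisting entirely of purines, is selected; in practice the spacing between such a window and the start codon would also need to be considered.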

A leader, or signal, sequence can be used to direct a res02 or inv19 polypeptide out of the host cell. Typically, a nucleotide sequence encoding the signal sequence is positioned in the coding region of a res02 or inv19 nucleic acid molecule, or directly at the 5′ end of a res02 or inv19 polypeptide coding region. Many signal sequences have been identified, and any of those that are functional in the selected host cell can be used in conjunction with a res02 or inv19 nucleic acid molecule. Additionally, a signal sequence can be chemically synthesized using methods described herein. In most cases, the secretion of a res02 or inv19 polypeptide from the host cell via the presence of a signal peptide will result in the removal of the signal peptide from the secreted res02 or inv19 polypeptide. The signal sequence can be a component of the vector, or it can be a part of a res02 or inv19 nucleic acid molecule that is inserted into the vector.

Included within the scope of this invention is the use of either a nucleotide sequence encoding a native res02 or inv19 polypeptide without a signal sequence or a nucleotide sequence encoding a heterologous signal sequence joined to a res02 or inv19 polypeptide coding region. The heterologous signal sequence selected should be one that is recognized and processed, i.e., cleaved by a signal peptidase, by the host cell. For prokaryotic host cells, the signal sequence is provided by a prokaryotic signal sequence selected, for example, from the group of the alkaline phosphatase, penicillinase, or heat-stable enterotoxin II leaders. For yeast secretion, the signal sequence can be provided by the yeast invertase, alpha factor, or acid phosphatase leaders. In mammalian cell expression any suitable mammalian signal sequence can be employed.

In some cases, such as where glycosylation is desired in a eukaryotic host cell expression system, one can manipulate the various presequences to improve glycosylation or yield. For example, one can alter the peptidase cleavage site of a particular signal peptide, or add pro-sequences, which also can affect glycosylation. The final protein product can have, in the −1 position (relative to the first amino acid of the mature protein) one or more additional amino acids incident to expression, which may not have been totally removed. For example, the final protein product can have one or two amino acid residues found in the peptidase cleavage site, attached to the amino-terminus. Alternatively, use of some enzyme cleavage sites may result in a slightly truncated form of the desired res02 or inv19 polypeptide, if the enzyme cuts at such area within the mature polypeptide.

The expression and cloning vectors of the present invention will typically contain a promoter that is recognized by the host organism and operably linked to the molecule encoding the res02 or inv19 polypeptide. Promoters are untranscribed sequences located upstream (i.e., 5′) to the start codon of a structural gene (generally within about 100 to 1000 bp) that control the transcription of the structural gene. Promoters are conventionally grouped into one of two classes: inducible promoters and constitutive promoters. Inducible promoters initiate increased levels of transcription from DNA under their control in response to some change in culture conditions, such as the presence or absence of a nutrient or a change in temperature. Constitutive promoters, on the other hand, initiate continual gene product production; that is, there is little or no control over gene expression. A large number of promoters, recognized by a variety of potential host cells, are well known. A suitable promoter is operably linked to the DNA encoding res02 or inv19 polypeptide by removing the promoter from the source DNA by restriction enzyme digestion and inserting the desired promoter sequence into the vector. The native res02 or inv19 promoter sequence can be used to direct amplification and/or expression of a res02 or inv19 nucleic acid molecule. A heterologous promoter is preferred, however, if it permits greater transcription and higher yields of the expressed protein as compared to the native promoter, and if it is compatible with the host cell system that has been selected for use.

Promoters suitable for use with prokaryotic hosts include the beta-lactamase and lactose promoter systems; alkaline phosphatase; a tryptophan (trp) promoter system; and hybrid promoters such as the tac promoter. Other known bacterial promoters are also suitable. Their sequences have been published, thereby enabling one skilled in the art to ligate them to the desired DNA sequence, using linkers or adapters as needed to supply any useful restriction sites.

Suitable promoters for use with yeast hosts are also well known in the art. Yeast enhancers are advantageously used with yeast promoters. Suitable promoters for use with mammalian host cells are well known and include, but are not limited to, those obtained from the genomes of viruses such as polyoma virus, fowlpox virus, adenovirus (such as Adenovirus 2), bovine papilloma virus, avian sarcoma virus, cytomegalovirus (CMV), retroviruses, hepatitis-B virus and Simian Virus 40 (SV40). Other suitable mammalian promoters include heterologous mammalian promoters, for example, heat-shock promoters and the actin promoter.

Additional promoters which may be of interest in controlling res02 or inv19 gene expression include, but are not limited to: the SV40 early promoter region (Benoist C et al. (1981) Nature 290:304-10); the CMV promoter; the promoter contained in the 3′ long terminal repeat of Rous sarcoma virus (Yamamoto T et al. (1980) Cell 22:787-97); the herpes thymidine kinase promoter (Wagner M J et al. (1981) Proc Natl Acad Sci USA 78:1441-45); the regulatory sequences of the metallothionein gene (Brinster R L et al. (1982) Nature 296:39-42); prokaryotic expression vectors such as the beta-lactamase promoter (Villa-Komaroff et al. (1978) Proc Natl Acad Sci USA 75:3727-31); or the tac promoter (DeBoer et al. (1983) Proc Natl Acad Sci USA 80:21-25). Also of interest are the following animal transcriptional control regions, which exhibit tissue specificity and have been utilized in transgenic animals: the elastase I gene control region which is active in pancreatic acinar cells (Swift et al. (1984) Cell 38:639-46; Ornitz et al. (1986) Cold Spring Harbor Symp Quant Biol 50:399-409; MacDonald (1987) Hepatology 7:425-515); the insulin gene control region which is active in pancreatic beta cells (Hanahan (1985) Nature 315:115-22); the immunoglobulin gene control region which is active in lymphoid cells (Grosschedl R et al. (1984) Cell 38:647-58; Adams J M et al. (1985) Nature 318:533-38; Alexander W S et al. (1987) Mol Cell Biol 7:1436-44); the mouse mammary tumor virus control region which is active in testicular, breast, lymphoid and mast cells (Leder P et al. (1986) Cell 45:485-95); the albumin gene control region which is active in liver (Pinkert C A et al. (1987) Genes Dev 1:268-76); the alpha-fetoprotein gene control region which is active in liver (Krumlauf R et al. (1985) Mol Cell Biol 5:1639-48; Hammer R E et al. (1987) Science 235:53-58); the alpha 1-antitrypsin gene control region which is active in the liver (Kelsey G D et al. (1987) Genes Dev 1:161-71); the beta-globin gene control region which is active in myeloid cells (Magram J et al. (1985) Nature 315:338-40; Kollias G et al. (1986) Cell 46:89-94); the myelin basic protein gene control region which is active in oligodendrocyte cells in the brain (Readhead C et al. (1987) Cell 48:703-12); the myosin light chain-2 gene control region which is active in skeletal muscle (Shani M (1985) Nature 314:283-86); and the gonadotropic releasing hormone gene control region which is active in the hypothalamus (Mason A J et al. (1986) Science 234:1372-78).

An enhancer sequence can be inserted into the vector to increase the transcription of a DNA encoding a res02 or inv19 polypeptide of the present invention by higher eukaryotes. Enhancers are cis-acting elements of DNA, usually about 10-300 bp in length, that act on the promoter to increase transcription. Enhancers are relatively orientation and position independent. They have been found 5′ and 3′ to the transcription unit. Several enhancer sequences available from mammalian genes are known (e.g., globin, elastase, albumin, alpha-fetoprotein and insulin). Typically, however, an enhancer from a virus will be used. The SV40 enhancer, the cytomegalovirus early promoter enhancer, the polyoma enhancer, and adenovirus enhancers are exemplary enhancing elements for the activation of eukaryotic promoters. While an enhancer can be spliced into the vector at a position 5′ or 3′ to a res02 or inv19 nucleic acid molecule, it is typically located at a site 5′ from the promoter.

Expression vectors of the invention can be constructed from a starting vector such as a commercially available vector. Such vectors may or may not contain all of the desired flanking sequences. Where one or more of the flanking sequences described herein are not already present in the vector, they can be individually obtained and ligated into the vector. Methods used for obtaining each of the flanking sequences are well known to one skilled in the art.

Preferred vectors for practicing this invention are those which are compatible with bacterial, insect, and mammalian host cells. Such vectors include, inter alia, pCRII, pCR3, and pcDNA3.1 (Invitrogen, San Diego, Calif.), pBSII (Stratagene, La Jolla, Calif.), pET15 (Novagen, Madison, Wis.), pGEX (Pharmacia Biotech, Piscataway, N.J.), pEGFP-N2 (Clontech, Palo Alto, Calif.), pETL (BlueBacII, Invitrogen), pDSR-alpha (PCT Pub. No. WO 90/14363) and pFastBacDual (Gibco-BRL, Grand Island, N.Y.).

Additional suitable vectors include, but are not limited to, cosmids, plasmids, or modified viruses, but it will be appreciated that the vector system must be compatible with the selected host cell. Such vectors include, but are not limited to plasmids such as Bluescript® plasmid derivatives (a high copy number ColE1-based phagemid, Stratagene Cloning Systems, La Jolla, Calif.), PCR cloning plasmids designed for cloning Taq-amplified PCR products (e.g., TOPO™ TA Cloning® Kit, PCR2.1® plasmid derivatives, Invitrogen, Carlsbad, Calif.), and mammalian, yeast or virus vectors such as a baculovirus expression system (pBacPAK plasmid derivatives, Clontech, Palo Alto, Calif.).

After the vector has been constructed and a nucleic acid molecule encoding a res02 or inv19 polypeptide has been inserted into the proper site of the vector, the completed vector can be inserted into a suitable host cell for amplification and/or polypeptide expression. The transformation of an expression vector for a res02 or inv19 polypeptide into a selected host cell can be accomplished by well known methods including methods such as transfection, infection, calcium chloride, electroporation, microinjection, lipofection, DEAE-dextran method, or other known techniques. The method selected will in part be a function of the type of host cell to be used. These methods and other suitable methods are well known to the skilled artisan, and are set forth, for example, in Sambrook et al., supra.

Host cells can be prokaryotic host cells (such as E. coli) or eukaryotic host cells (such as a yeast, insect, or vertebrate cell). The host cell, when cultured under appropriate conditions, synthesizes a res02 or inv19 polypeptide which can subsequently be collected from the culture medium (if the host cell secretes it into the medium) or directly from the host cell producing it (if it is not secreted). The selection of an appropriate host cell will depend upon various factors, such as desired expression levels, polypeptide modifications that are desirable or necessary for activity (such as glycosylation or phosphorylation) and ease of folding into a biologically active molecule.

A number of suitable host cells are known in the art and many are available from the American Type Culture Collection (ATCC), Manassas, Va. Examples include, but are not limited to, mammalian cells, such as Chinese hamster ovary cells (CHO), CHO DHFR(−) cells (Urlaub G et al. (1980) Proc Natl Acad Sci USA 77:4216-20), human embryonic kidney (HEK) 293 or 293T cells, or 3T3 cells. The selection of suitable mammalian host cells and methods for transformation, culture, amplification, screening, product production, and purification are known in the art. Other suitable mammalian cell lines are the monkey COS-1 and COS-7 cell lines, and the CV-1 cell line. Further exemplary mammalian host cells include primate cell lines and rodent cell lines, including transformed cell lines. Normal diploid cells, cell strains derived from in vitro culture of primary tissue, as well as primary explants, are also suitable. Candidate cells may be genotypically deficient in the selection gene, or may contain a dominantly acting selection gene. Other suitable mammalian cell lines include, but are not limited to, mouse neuroblastoma N2A cells, HeLa, mouse L-929 cells, 3T3 lines derived from Swiss, BALB/c or NIH mice, BHK or HaK hamster cell lines. Each of these cell lines is known by and available to those skilled in the art of protein expression.

Similarly useful as host cells suitable for the present invention are bacterial cells. For example, the various strains of E. coli (e.g., HB101, DH5α, DH10, and MC1061) are well-known as host cells in the field of biotechnology. Various strains of B. subtilis, Pseudomonas spp., other Bacillus spp., Streptomyces spp., and the like can also be employed in this method.

Many strains of yeast cells known to those skilled in the art are also available as host cells for the expression of the polypeptides of the present invention. Preferred yeast cells include, for example, Saccharomyces cerevisiae and Pichia pastoris.

Additionally, where desired, insect cell systems can be utilized in the methods of the present invention. Such systems are described, for example, in Kitts P A et al. (1993) Biotechniques 14:810-17; Luckow V A (1993) Curr Opin Biotechnol 4:564-72; and Luckow V A et al. (1993) J Virol 67:4566-79. Preferred insect cells are Sf-9 and Hi5 (Invitrogen).

One can also use transgenic animals to express glycosylated res02 or inv19 polypeptides. For example, one can use a transgenic milk-producing animal (a cow or goat, for example) and obtain the present glycosylated polypeptide in the animal milk. One can also use plants to produce res02 or inv19 polypeptides; however, in general, the glycosylation occurring in plants is different from that produced in mammalian cells, and can result in a glycosylated product which is not suitable for human therapeutic use.

Polypeptide Production

Host cells comprising a res02 or inv19 polypeptide expression vector can be cultured using standard media well known to the skilled artisan. The media will usually contain all nutrients necessary for the growth and survival of the cells. Suitable media for culturing E. coli cells include, for example, Luria Broth (LB) and/or Terrific Broth (TB). Suitable media for culturing eukaryotic cells include Roswell Park Memorial Institute medium 1640 (RPMI 1640), Minimal Essential Medium (MEM) and/or Dulbecco's Modified Eagle Medium (DMEM), all of which can be supplemented with serum and/or growth factors as necessary for the particular cell line being cultured. A suitable medium for insect cultures is Grace's medium supplemented with yeastolate, lactalbumin hydrolysate, and/or fetal calf serum as necessary.

Typically, an antibiotic or other compound useful for selective growth of transfected or transformed cells is added as a supplement to the media. The compound to be used will be dictated by the selectable marker element present on the plasmid with which the host cell was transformed. For example, where the selectable marker element is kanamycin resistance, the compound added to the culture medium will be kanamycin. Other compounds for selective growth include ampicillin, tetracycline, and neomycin.

The amount of a res02 or inv19 polypeptide produced by a host cell can be evaluated using standard methods known in the art. Such methods include, without limitation, Western immunoblot analysis, SDS-polyacrylamide gel electrophoresis, non-denaturing gel electrophoresis, high performance liquid chromatography (HPLC) separation, immunoprecipitation, and/or activity assays such as DNA binding gel shift assays.

If a res02 or inv19 polypeptide has been designed to be secreted from the host cells, the majority of polypeptide may be found in the cell culture medium. If, however, the res02 or inv19 polypeptide is not secreted from the host cells, it will be present in the cytoplasm and/or the nucleus (for eukaryotic host cells) or in the cytosol (for bacterial host cells).

For a res02 or inv19 polypeptide situated in the host cell cytoplasm and/or nucleus (for eukaryotic host cells) or in the cytosol (for bacterial host cells), the intracellular material (including inclusion bodies for gram-negative bacteria) can be extracted from the host cell using any standard technique known to the skilled artisan. For example, the host cells can be lysed to release the contents of the periplasm/cytoplasm by French press, homogenization, and/or sonication followed by centrifugation.

If a res02 or inv19 polypeptide has formed inclusion bodies in the cytosol, the inclusion bodies can often bind to the inner and/or outer cellular membranes and thus will be found primarily in the pellet material after centrifugation. The pellet material can then be treated at pH extremes or with a chaotropic agent such as a detergent, guanidine, guanidine derivatives, urea, or urea derivatives in the presence of a reducing agent such as dithiothreitol at alkaline pH or tris carboxyethyl phosphine at acid pH to release, break apart, and solubilize the inclusion bodies. The solubilized res02 or inv19 polypeptide can then be analyzed using gel electrophoresis, immunoprecipitation, or the like. If it is desired to isolate the res02 or inv19 polypeptide, isolation can be accomplished using standard methods such as those described herein and in Marston F A et al. (1990) Methods Enzymol 182:264-76.

In some cases, a res02 or inv19 polypeptide may not be biologically active upon isolation. Various methods for “refolding” or converting the polypeptide to its tertiary structure and generating disulfide linkages can be used to restore biological activity. Such methods include exposing the solubilized polypeptide to a pH usually above 7 and in the presence of a particular concentration of a chaotrope. The selection of chaotrope is very similar to the choices used for inclusion body solubilization, but usually the chaotrope is used at a lower concentration and is not necessarily the same as chaotropes used for the solubilization. In most cases the refolding/oxidation solution will also contain a reducing agent or the reducing agent plus its oxidized form in a specific ratio to generate a particular redox potential allowing for disulfide shuffling to occur in the formation of the protein's cysteine bridges. Some of the commonly used redox couples include cysteine/cystamine, glutathione (GSH)/dithiobis GSH, cupric chloride, dithiothreitol (DTT)/dithiane DTT, and 2-mercaptoethanol (βME)/dithio-β(ME). In many instances, a cosolvent can be used or can be needed to increase the efficiency of the refolding, and the more common reagents used for this purpose include glycerol, polyethylene glycol of various molecular weights, arginine and the like.

If inclusion bodies are not formed to a significant degree upon expression of a res02 or inv19 polypeptide, then the polypeptide will be found primarily in the supernatant after centrifugation of the cell homogenate. The polypeptide can be further isolated from the supernatant using methods such as those described herein.

The purification of a res02 or inv19 polypeptide from solution can be accomplished using a variety of techniques. If the polypeptide has been synthesized such that it contains a tag such as Hexahistidine (res02 or inv19 polypeptide/hexaHis) or other small peptide such as FLAG (Eastman Kodak Co., New Haven, Conn.) or myc (Invitrogen, Carlsbad, Calif.) at either its carboxyl- or amino-terminus, it can be purified in a one-step process by passing the solution through an affinity column where the column matrix has a high affinity for the tag.

For example, polyhistidine binds with great affinity and specificity to nickel. Thus, an affinity column of nickel (such as the Qiagen® nickel columns) can be used for purification of res02 or inv19 polypeptide/polyHis. See, e.g., Current Protocols in Molecular Biology 10.11.8 (Ausubel et al., eds., Green Publishers, Inc. and Wiley and Sons, 1993).

Additionally, res02 or inv19 polypeptides can be purified through the use of a monoclonal antibody that is capable of specifically recognizing and binding to a res02 or inv19 polypeptide.

Other suitable procedures for purification include, without limitation, affinity chromatography, immunoaffinity chromatography, ion exchange chromatography, molecular sieve chromatography, HPLC, electrophoresis (including native gel electrophoresis) followed by gel elution, and preparative isoelectric focusing (“Isoprime” machine/technique, Hoefer Scientific, San Francisco, Calif.). In some cases, two or more purification techniques can be combined to achieve increased purity.

Res02 or inv19 polypeptides can also be prepared by chemical synthesis methods (such as solid phase peptide synthesis) using techniques known in the art such as those set forth by Merrifield et al. (1963) J Am Chem Soc 85:2149; Houghten R A et al. (1985) Proc Natl Acad Sci USA 82:5131-35; and Stewart and Young, Solid Phase Peptide Synthesis (Pierce Chemical Co., 1984). Such polypeptides can be synthesized with or without a methionine on the amino-terminus. Chemically synthesized res02 or inv19 polypeptides can be oxidized using methods set forth in these references to form disulfide bridges. Chemically synthesized res02 or inv19 polypeptides are expected to have comparable biological activity to the corresponding res02 or inv19 polypeptides produced recombinantly or purified from natural sources, and thus can be used interchangeably with a recombinant or natural res02 or inv19 polypeptide.

Another means of obtaining res02 or inv19 polypeptide is via purification from biological samples such as capsular polysaccharide-expressing bacterial cells in which the res02 or inv19 polypeptide is naturally found. Such purification can be conducted using methods for protein purification as described herein. The presence of the res02 or inv19 polypeptide during purification can be monitored, for example, using an antibody prepared against recombinantly produced res02 or inv19 polypeptide or peptide fragments thereof.

A number of additional methods for producing nucleic acids and polypeptides are known in the art, and the methods can be used to produce polypeptides having specificity for res02 or inv19 polypeptide. See, e.g., Roberts R W et al. (1997) Proc Natl Acad Sci USA 94:12297-302, which describes the production of fusion proteins between an mRNA and its encoded peptide. See also, Roberts R W (1999) Curr Opin Chem Biol 3:268-73. Additionally, U.S. Pat. No. 5,824,469 describes methods for obtaining oligonucleotides capable of carrying out a specific biological function. The procedure involves generating a heterogeneous pool of oligonucleotides, each having a 5′ randomized sequence, a central preselected sequence, and a 3′ randomized sequence. The resulting heterogeneous pool is introduced into a population of cells that do not exhibit the desired biological function. Subpopulations of the cells are then screened for those that exhibit a predetermined biological function. From that subpopulation, oligonucleotides capable of carrying out the desired biological function are isolated.

U.S. Pat. Nos. 5,763,192; 5,814,476; 5,723,323; and 5,817,483 describe processes for producing peptides or polypeptides. This is done by producing stochastic genes or fragments thereof, and then introducing these genes into host cells which produce one or more proteins encoded by the stochastic genes. The host cells are then screened to identify those clones producing peptides or polypeptides having the desired activity.

Another method for producing peptides or polypeptides is described in PCT/US98/20094 (WO99/15650) filed by Athersys, Inc. Known as “Random Activation of Gene Expression for Gene Discovery” (RAGE-GD), the process involves the activation of endogenous gene expression or over-expression of a gene by in situ recombination methods. For example, expression of an endogenous gene is activated or increased by integrating a regulatory sequence into the target cell which is capable of activating expression of the gene by non-homologous or illegitimate recombination. The target DNA is first subjected to radiation, and a genetic promoter inserted. The promoter eventually locates a break at the front of a gene, initiating transcription of the gene. This results in expression of the desired peptide or polypeptide.

It will be appreciated that these methods can also be used to create comprehensive res02 or inv19 polypeptide expression libraries, which can subsequently be used for high throughput phenotypic screening in a variety of assays, such as biochemical assays, cellular assays, and whole organism assays (e.g., plant, mouse, etc.).

Selective Binding Agents

The term “selective binding agent” refers to a molecule that has specificity for one or more antigens selected from res02 polypeptide, inv19 polypeptide, PSA, PSB, PSC, PSD, PSE, PSF, PSG, PSH, and fragments thereof. Suitable selective binding agents include, but are not limited to, antibodies and derivatives thereof, polypeptides, and small molecules. Suitable selective binding agents can be prepared using methods known in the art.

Selective binding agents such as antibodies and antibody fragments that bind res02 polypeptide, inv19 polypeptide, PSA, PSB, PSC, PSD, PSE, PSF, PSG, PSH, and fragments thereof, are within the scope of the present invention. The antibodies can be polyclonal including monospecific polyclonal; monoclonal (MAbs); recombinant; chimeric; humanized, such as CDR-grafted; human; single chain; and/or bispecific; as well as fragments, variants, or derivatives thereof. Antibody fragments include those portions of the antibody that bind to an epitope of an antigen. Examples of such fragments include Fab, F(ab′) and F(ab′)₂ fragments generated by enzymatic cleavage of full-length antibodies. Other binding fragments include those generated by recombinant DNA techniques, such as the expression of recombinant plasmids containing nucleic acid sequences encoding antibody variable regions.

Polyclonal antibodies directed toward an antigen generally are produced in animals (e.g., rabbits or mice) by means of multiple subcutaneous or intraperitoneal injections of antigen and an adjuvant. It may be useful to conjugate an antigen to a carrier protein that is immunogenic in the species to be immunized, such as keyhole limpet hemocyanin, albumin, bovine thyroglobulin, or soybean trypsin inhibitor. Also, aggregating agents such as alum are used to enhance the immune response. After immunization, the animals are bled and the serum is assayed for antigen-specific antibody titer.

Monoclonal antibodies directed toward an antigen are produced using any method that provides for the production of antibody molecules by continuous cell lines in culture. Examples of suitable methods for preparing monoclonal antibodies include the hybridoma methods of Kohler G et al. (1975) Nature 256:495-97 and the human B-cell hybridoma method (Kozbor D et al. (1984) J Immunol 133:3001-5; Brodeur et al., Monoclonal Antibody Production Techniques and Applications 51-63 (Marcel Dekker, Inc., 1987)). Also provided by the invention are hybridoma cell lines that produce monoclonal antibodies reactive with res02 polypeptide, inv19 polypeptide, PSA, PSB, PSC, PSD, PSE, PSF, PSG, PSH, and fragments thereof.

The antibodies of the invention can be employed in any known assay method, such as competitive binding assays, direct and indirect sandwich assays, Western immunoblot assays, and immunoprecipitation assays (Sola, Monoclonal Antibodies: A Manual of Techniques, CRC Press, Inc., 1987, pp. 147-158) for the detection and quantitation of antigen. The antibodies will bind antigen with an affinity that is appropriate for the assay method being employed.

For diagnostic applications, in certain embodiments, anti-res02, anti-inv19, or anti-capsular polysaccharide antibodies can be labeled with a detectable moiety. The detectable moiety can be any one that is capable of producing, either directly or indirectly, a detectable signal. For example, the detectable moiety can be a radioisotope, such as ³H, ¹⁴C, ³²P, ³⁵S, or ¹²⁵I; a fluorescent or chemiluminescent compound, such as fluorescein isothiocyanate, rhodamine, or luciferin; or an enzyme, such as alkaline phosphatase, beta-galactosidase, or horseradish peroxidase (Bayer E A et al. (1990) Methods Enzymol 184:138-60).

Competitive binding assays rely on the ability of a labeled standard (e.g., a res02 or inv19 polypeptide, a capsular polysaccharide, or an immunologically reactive portion thereof) to compete with the test sample analyte (a res02 or inv19 polypeptide or a capsular polysaccharide) for binding with a limited amount of anti-res02, anti-inv19, or anti-capsular polysaccharide antibody. The amount of an antigen in the test sample is inversely proportional to the amount of standard that becomes bound to the antibodies. To facilitate determining the amount of standard that becomes bound, the antibodies typically are insolubilized before or after the competition, so that the standard and analyte that are bound to the antibodies may conveniently be separated from the standard and analyte which remain unbound.
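The inverse relationship between analyte and bound standard can be illustrated with a minimal sketch. It assumes, purely for illustration, that labeled standard and unlabeled analyte bind the antibody with equal affinity and that the limited antibody sites are fully occupied, so sites are shared in proportion to abundance. The function name and the numeric inputs are hypothetical and are not taken from this specification.

```python
def fraction_standard_bound(standard: float, analyte: float) -> float:
    """Fraction of occupied antibody sites that hold labeled standard.

    Assumption (illustrative only): standard and analyte compete with equal
    affinity for a limiting, saturated number of binding sites, so each
    species occupies sites in proportion to its amount.
    """
    if standard <= 0 or analyte < 0:
        raise ValueError("standard must be > 0 and analyte must be >= 0")
    return standard / (standard + analyte)


if __name__ == "__main__":
    # As the analyte amount rises, less labeled standard ends up bound.
    for analyte in (0.0, 1.0, 3.0, 9.0):
        print(analyte, fraction_standard_bound(1.0, analyte))
```

In practice the bound signal is read against a standard curve prepared with known amounts of antigen; the simple ratio above only conveys the direction of the effect.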

Sandwich assays typically involve the use of two antibodies, each capable of binding to a different immunogenic portion, or epitope, of the protein to be detected and/or quantitated. In a sandwich assay, the test sample analyte is typically bound by a first antibody which is immobilized on a solid support, and thereafter a second antibody binds to the analyte, thus forming an insoluble three-part complex. See, e.g., U.S. Pat. No. 4,376,110. The second antibody can itself be labeled with a detectable moiety (direct sandwich assays) or can be measured using an anti-immunoglobulin antibody that is labeled with a detectable moiety (indirect sandwich assays). For example, one type of sandwich assay is an enzyme-linked immunosorbent assay (ELISA), in which case the detectable moiety is an enzyme.

EXAMPLES

The following examples are illustrative only and are not intended to limit the scope of the invention in any way.

Example 1 Bacterial Strain and Isolation of B. fragilis Polysaccharide

B. fragilis NCTC9343 was originally obtained from the National Collection of Type Cultures (London, England), stored at −80° C. in yeast broth until used, and grown anaerobically as previously described. Pantosti A et al. (1991) Infect Immun 59:2075-2082. The capsular polysaccharide from B. fragilis NCTC9343 was isolated by hot phenol/water extraction, and subsequent purification of PSA and PSB was performed as previously described. Tzianabos A et al. (1992) J Biol Chem 267:18230-18235.

Example 2 Identification of res02 and inv19

Hypothesizing that the DNA inversions of the seven promoter regions would be controlled by specific proteins that are involved in recombining DNA, a search of B. fragilis NCTC9343 genomic sequences provided by the microbial pathogen group at the Sanger Centre was performed, looking for open reading frames (ORFs) with homology to fimB and fimE, two genes of E. coli that are involved in inverting DNA. Five ORFs were retrieved. These five B. fragilis ORFs were then used to reprobe the database, and twenty-five homologs were retrieved and given temporary designations of inv1-inv25. Three additional homologs have subsequently been retrieved and given temporary designations of inv26-inv28.

Based on our data, we knew that the gene product that is involved in inverting the polysaccharide promoters would be conserved in all strains. Therefore, DNA hybridizations were performed to probe a collection of B. fragilis strains with internal portions of each of these 25 ORFs.

Seven of the original twenty-five ORFs (and nine of the twenty-eight ORFs) were found to be conserved within B. fragilis. Analysis of the sequence surrounding one of these conserved genes, inv19, demonstrated a divergently transcribed gene of the serine site-specific recombinase family which was designated res02 (now also denoted mpi) and was also shown by hybridization to be conserved among B. fragilis.

Fifteen out of fifteen B. fragilis strains tested probed positive with an internal portion of res02 (Table 3). Twenty-five out of twenty-six B. fragilis strains tested probed positive with an internal portion of inv19; the one negative strain was 12791551 (Table 4).

TABLE 3
res02 probe-positive B. fragilis strains.

1279-2      45703       CM3
13141       B110        CM11
17905       B117        CM12
2429        B124        PA5
26877       B272        US398

TABLE 4
inv19 probe-positive B. fragilis strains.

12775L1II   1285531I    2429    B117      CM11
1277810I    1287245I    17905   B124      CM12
1279-2      12905-23V   26877   B272      IL89375II
1281262I    1291662III  45703   B356772I  PA5
1284-2      13141       B110    CM33      US398

Several of these conserved genes were cloned into a vector along with the PSA promoter invertible region and analyzed for their ability to invert the PSA promoter region. Res02 was demonstrated to bring about inversion of the PSA promoter region.

Example 3 Method of Deleting res02 from the B. fragilis 9343 Chromosome

In order to delete the res02 gene from the B. fragilis 9343 chromosome, homologous recombination with a double crossover event was employed to replace the full length copy of res02 with a deleted copy. To do this, plasmid pKGW10 was created. In order to construct this plasmid, 9343 chromosomal DNA was used as a template to amplify the DNA flanking the region to be deleted (FIG. 5). Primer pairs inv19D-1 plus inv19D-2 and inv19D-5 plus inv19D-6 were used in two separate PCRs. Table 5 shows that primers inv19D-1 and inv19D-6 each incorporate a BamHI restriction endonuclease site (ggatcc), while primers inv19D-2 and inv19D-5 each incorporate an NcoI restriction endonuclease site (ccatgg).

TABLE 5
Sequences of the primers used to make the res02 deletion.

inv19-D1 →   5′-ccggatccagtactgataactccggtgactcc-3′     (SEQ ID NO:11)
inv19-D2 ←   5′-atccatggccggtttatgaaaacgatgtatta-3′     (SEQ ID NO:12)
inv19-D5 →   5′-cgccatggttttccgtacttactctcaaataagc-3′   (SEQ ID NO:13)
inv19-D6 ←   5′-ggggatccatgacatagataatggggaagagg-3′     (SEQ ID NO:14)

Upon amplification, the resulting amplification products were: Left Flank, inv19-D1→ through ←inv19-D2, 1,958 bp (FIG. 6, SEQ ID NO:5); and Right Flank, inv19-D5→ through ←inv19-D6, 2,540 bp (FIG. 7, SEQ ID NO:6).
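As a quick check of the primer design, the following sketch scans each Table 5 primer for the BamHI and NcoI recognition sequences. The primer sequences are those listed in Table 5; the recognition sequences (GGATCC for BamHI, CCATGG for NcoI) are the standard published values. The dictionary and function names are illustrative only.

```python
# Primer sequences as listed in Table 5 (written 5' to 3').
PRIMERS = {
    "inv19-D1": "ccggatccagtactgataactccggtgactcc",
    "inv19-D2": "atccatggccggtttatgaaaacgatgtatta",
    "inv19-D5": "cgccatggttttccgtacttactctcaaataagc",
    "inv19-D6": "ggggatccatgacatagataatggggaagagg",
}

# Standard recognition sequences for the two restriction endonucleases.
SITES = {"BamHI": "ggatcc", "NcoI": "ccatgg"}


def sites_in(primer_seq: str) -> list[str]:
    """Return the names of the enzymes whose site occurs in the primer."""
    seq = primer_seq.lower()
    return [name for name, site in SITES.items() if site in seq]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, sites_in(seq))
```

Both recognition sequences are palindromic, so searching the primer strand alone is sufficient for this check.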

These primers were placed so that 534 bp of the 594 bp res02 open reading frame were deleted. The resulting PCR products were digested with BamHI and NcoI (restriction sites constructed into the primers) and ligated in a three-way reaction with a B. fragilis suicide vector that encoded erythromycin resistance. This ligation mixture was transformed into E. coli and the transformants were tested by both PCR and plasmid digestion for the correct ligation of the flanks. The plasmid containing the correct ligation of the flanks was named pKGW10.
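The size of the deletion can be put in proportion with a few lines of arithmetic. The sketch below uses only the two lengths stated above (594 bp open reading frame, 534 bp deleted); the function name is illustrative, and the calculation ignores any bases contributed by the NcoI junction between the flanks.

```python
ORF_LENGTH_BP = 594  # res02 open reading frame length, from the text
DELETED_BP = 534     # length removed by the deletion, from the text


def deletion_summary(orf_bp: int, deleted_bp: int) -> dict:
    """Summarize how much of an open reading frame a deletion removes."""
    if not 0 <= deleted_bp <= orf_bp:
        raise ValueError("deleted length must lie between 0 and the ORF length")
    return {
        "remaining_bp": orf_bp - deleted_bp,
        "percent_deleted": round(100 * deleted_bp / orf_bp, 1),
        "codons_deleted": deleted_bp // 3,
        "whole_codons": deleted_bp % 3 == 0,
    }


if __name__ == "__main__":
    print(deletion_summary(ORF_LENGTH_BP, DELETED_BP))
```

On these numbers roughly nine tenths of the open reading frame is removed, leaving 60 bp of res02 coding sequence.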

pKGW10 was conjugally transferred from E. coli to B. fragilis 9343 and the cointegrate (containing all of the DNA of pKGW10 integrated into the chromosome) resulting from homologous recombination was selected by erythromycin resistance (encoded by pKGW10). The cointegrate was passaged in nonselective medium to allow for resolution of the cointegrate by the second recombination event (where either the mutant copy or the wild type copy of res02 was lost along with the intervening plasmid). Bacteria were plated onto medium without antibiotics (approximately 200 colonies per plate). The colonies were replica plated to medium containing erythromycin and the erythromycin-sensitive (Em^(s)) colonies were selected (these Em^(s) colonies represent those that only have the mutant or wild type copy of res02, but not both). These colonies were screened by PCR for those containing the mutant genotype. PCR was used to determine that each of the seven polysaccharide invertible promoter regions was locked and unable to invert in each of the res02 mutants. Deletion was also confirmed by Southern blot in each case.

Example 4 Monospecific Antisera for PSA-PSH

Monospecific antisera for each of the eight known capsular polysaccharides of B. fragilis were prepared as previously described. Comstock L E et al. (1999) Infect Immun 67:3525-32; Coyne M J et al. (2000) Infect Immun 68:6176-81; Coyne M et al. (2001) Infect Immun 69:4342-50; Krinos C M et al. (2001) Nature 414:555-8. These monospecific antisera were used in Western immunoblot phenotype analyses of res02 mutants.

Example 5 Deletion of res02 Locks Capsular Polysaccharide Promoters On or Off

The res02 open reading frame was deleted from the 9343 chromosome by double crossover allelic exchange, resulting in several mutants that all had a chromosomal deletion of res02. Analysis of these mutants demonstrated that each of the seven polysaccharide promoter inversion regions was locked in a single orientation, demonstrating that the Res02 product is involved in the DNA inversions.

Several of these mutants had the PSA and PSE promoter regions locked on and the promoter regions for PSB, PSD, PSF, PSG, and PSH locked off. Therefore, it was expected that these strains would constitutively express PSA, PSC (the promoter of PSC does not undergo inversions), and PSE, but not express PSB, PSD, PSF, PSG, or PSH.
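The reasoning from locked promoter orientation to expected phenotype can be written out as a small sketch. It encodes only what the text states: seven polysaccharides have invertible promoters, and PSC does not undergo inversion and is therefore treated as always on. The names below are illustrative, and the result is a prediction from promoter orientation alone, which still has to be confirmed experimentally.

```python
# Polysaccharides whose promoters invert, per the text.
INVERTIBLE = ("PSA", "PSB", "PSD", "PSE", "PSF", "PSG", "PSH")
# PSC's promoter does not undergo inversion, so it is treated as always on.
NON_INVERTIBLE = ("PSC",)


def expected_expression(locked_on: set[str]) -> list[str]:
    """Predict constitutively expressed polysaccharides in a res02 mutant.

    locked_on: invertible promoters locked in the "on" orientation; all other
    invertible promoters are taken to be locked "off".
    """
    unknown = set(locked_on) - set(INVERTIBLE)
    if unknown:
        raise ValueError(f"not an invertible promoter: {sorted(unknown)}")
    return sorted(set(locked_on) | set(NON_INVERTIBLE))


if __name__ == "__main__":
    print(expected_expression({"PSA", "PSE"}))
```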

FIG. 8 shows the results of phenotypic analysis of one of these mutants, 9343res02mut44 (mut44). This analysis demonstrated that mut44 indeed synthesized PSA in high quantities (at least ten times more purified PSA is isolated compared to a typical grow-up with wild-type). It was also shown that mut44 was unable to produce six of the other seven capsular polysaccharides, including PSC and PSE (as determined by Western blot analysis, immunoelectrophoresis, and analysis of purified capsular polysaccharide). It was unexpectedly found that, despite the fact that the promoter for the polysaccharide biosynthesis locus of PSF was turned off, mut44 expressed small amounts of PSF (FIG. 8). This observation could be consistent with the existence of a secondary promoter that is unaffected by DNA inversions. However, another mutant, mut8, was discovered to express only PSH and no PSF (FIG. 8). The small amount of PSF expressed by mut44 does not significantly interfere with the purification of PSA from this strain.

Another res02 mutant, 9343res02mut2 (mut2), was isolated that also had all of the polysaccharide promoter flip regions locked off except for the PSA promoter region, which was locked on. The phenotype of this mutant was found to be the same as that of mut44, i.e., it was found to overexpress PSA and to express small amounts of PSF, but none of the other known capsular polysaccharides.

Example 6 mut44 Overexpresses PSA

By deletion of the open reading frame designated res02, B. fragilis strains have been created that overexpress PSA compared to wild type and that are devoid of most or all of the other seven known capsular polysaccharides of this strain. These strains produce PSA in amounts sufficient for easy purification, making them attractive for large scale purification of the potent zwitterionic polysaccharide, PSA, for commercial purposes.

The yield of total capsular polysaccharide (all eight polysaccharides together) isolated from wild type B. fragilis NCTC9343 ranges from 6.26-21.9 mg/liter of culture. After extensive methods for purification of PSA from the other polysaccharides, the yield of pure PSA isolated from wild type B. fragilis NCTC9343 ranges from 0-3.1 mg/liter of culture, with an average of 1.56 mg/liter of culture. In the first large grow-up (16 liters) of mut44 there was a yield of 21.8 mg pure PSA per liter of culture. Not only was the yield 14 times greater than from wild type, but the extraction and purification methods were much easier because the PSA did not have to be isolated from the other capsular polysaccharides. Unlike wild type, the yield of PSA from this mutant is expected to be consistently high since expression of PSA is no longer undergoing phase variation due to the promoter being unable to flip off.
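The fold increase quoted above follows directly from the stated yields, as the short calculation below shows. All three input values are taken from the text; the constant and function names are illustrative.

```python
WT_MEAN_PSA_MG_PER_L = 1.56  # average pure PSA yield from wild type, from the text
MUT44_PSA_MG_PER_L = 21.8    # pure PSA yield from the first mut44 grow-up, from the text
MUT44_CULTURE_L = 16         # volume of that grow-up, from the text


def fold_increase(mutant_yield: float, wild_type_yield: float) -> float:
    """Ratio of mutant yield to wild-type yield (same units for both)."""
    if wild_type_yield <= 0:
        raise ValueError("wild-type yield must be positive")
    return mutant_yield / wild_type_yield


def total_yield_mg(yield_mg_per_l: float, volume_l: float) -> float:
    """Total mass recovered from a culture of the given volume."""
    return yield_mg_per_l * volume_l


if __name__ == "__main__":
    print(round(fold_increase(MUT44_PSA_MG_PER_L, WT_MEAN_PSA_MG_PER_L), 1))
    print(round(total_yield_mg(MUT44_PSA_MG_PER_L, MUT44_CULTURE_L), 1))
```

The ratio is computed against the wild-type average; against the best wild-type yield reported above (3.1 mg/liter) the increase would be about sevenfold.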

Example 7 Large Scale Purification of PSA from mut44

A 16 liter batch fermentation of mut44 was grown in supplemented basal medium. Following overnight growth, the bacteria were pelleted and resuspended in 667 ml of dH₂O at 68° C. Glass beads were added to the resuspended pellet, and the mixture was placed in a 4 liter water bath. 667 ml of 75% phenol (prewarmed to 68° C.) was added, and the mixture was stirred for at least 30 min. The mixture was then stirred overnight in the cold room. The mixture was then centrifuged at 8000 rpm for 20 min. The top aqueous phase was removed (approx. 500 ml) and added to 500 ml of ether in a separatory funnel. The mixture was shaken and allowed to separate for 20 min. The bottom phase was retained. The sample was placed in a rotary evaporator in a 60° C. waterbath for ether evaporation and sample concentration. The sample was placed in dialysis tubing and dialyzed against 10 liters of water with six changes of buffer. 1M Tris was added to make it 6.5% of the volume. MgCl₂ and CaCl₂ were added to a final concentration of 20 mM. RNase was added to bring the concentration to 3.33 mg/ml and DNase was added to 0.07 mg/ml. The sample was digested overnight at 37° C. The pH of the sample was adjusted to 7.5 with 10M NaOH. Pronase was added to 0.33 mg/ml and incubated overnight. Fresh pronase was added and incubated for an additional 2 hours. EDTA was added to make the solution 50 mM and mixed for 30 min. The sample was precipitated using 5 volumes of −20° C. ethanol. The sample was resuspended in 3% deoxycholate and applied to a sepharose 400 column. Column fractions were monitored by silver-stained SDS-PAGE for separation of high molecular weight capsular polysaccharide and low molecular weight LPS. PSA purity was tested by immunoelectrophoresis and NMR analysis.

Example 8 Deletion of inv19

The inv19 open reading frame was deleted from the 9343 chromosome by double crossover allelic exchange, resulting in several mutants that all have a chromosomal deletion of inv19.

Biological Deposit

A deposit of mut44 was made with American Type Culture Collection, Manassas, Va. on Mar. 11, 2002, under the description Bacteroides fragilis: 9343res02mut44, and assigned to Patent Deposit Designation PTA-4135.

All of the references, patents and patent publications identified or cited herein are incorporated, in their entirety, by reference.

Although this invention has been described with respect to specific embodiments, the details of these embodiments are not to be construed as limitations. Various equivalents, changes and modifications may be made without departing from the spirit and scope of this invention, and it is understood that such equivalent embodiments are part of this invention.

1-28. (canceled)
29. A population of bacterial cells stably expressing a specific capsular polysaccharide selected from the group consisting of: PSA, PSB, PSD, PSE, PSF, PSG, and PSH.
30. The population of bacterial cells according to claim 29, wherein the specific capsular polysaccharide is PSA.
31. The population of bacterial cells according to claim 29 or claim 30, wherein the bacterial cells are B. fragilis.
32. The population of bacterial cells according to claim 29 or claim 30, wherein the bacterial cells are B. fragilis NCTC9343.
33. The population of bacterial cells according to claim 29 or claim 30, wherein the bacterial cells are B. fragilis 9343res02mut44.
34. The population of bacterial cells according to claim 29 or claim 30, wherein the bacterial cells are B. fragilis 9343res02mut2.
35-65. (canceled)
66. The population of bacterial cells according to claim 30, wherein the bacterial cells are B. fragilis.
67. The population of bacterial cells according to claim 30, wherein the bacterial cells are B. fragilis NCTC9343.
68. The population of bacterial cells according to claim 30, wherein the bacterial cells are B. fragilis 9343res02mut44.
69. The population of bacterial cells according to claim 30, wherein the bacterial cells are B. fragilis 9343res02mut2.